(57) Abstract: Second promoter for mouse and human utrophin genes. The promoters or fragments and derivatives may be used to 
control transcription of heterologous sequences, including coding sequences of reporter genes. Expression systems such as host cells 
containing nucleic acid constructs which comprise a promoter as provided operably linked to a heterologous sequence may be used 
to screen substances for ability to modulate activity of the utrophin promoter. Substances with such ability may be manufactured 
and/or used in the preparation of compositions such as medicaments. Up-regulation of utrophin expression may compensate for 

UTROPHIN GENE PROMOTER 

The present invention is based on cloning of a genomic 
promoter region of the human utrophin gene and of the mouse 
utrophin gene. 

The severe muscle wasting disorders Duchenne muscular 
dystrophy (DMD) and the less debilitating Becker muscular 
dystrophy (BMD) are due to mutations in the dystrophin gene 
resulting in a lack of dystrophin or abnormal expression of 
truncated forms of dystrophin, respectively. Dystrophin is a 
large cytoskeletal protein (427 kDa with a length of 125 nm) 
which in muscle is located at the cytoplasmic surface of the 
sarcolemma, the neuromuscular junction (NMJ) and myotendinous 
junction (MTJ). It binds to a complex of proteins and 
glycoproteins spanning the sarcolemma called the dystrophin 
associated glycoprotein complex (DGC). The breakdown of the 
integrity of this complex, due to loss or impairment of 
dystrophin function, leads to muscle degeneration and the DMD 
phenotype. 

The dystrophin gene is the largest gene so far identified in 
man, covering over 2.7 megabases and containing 79 exons. The 
corresponding 14kb dystrophin mRNA is expressed predominantly 
in skeletal, cardiac and smooth muscle with lower levels in 
brain. Transcription of dystrophin in different tissues is 
regulated from either the brain promoter (predominantly active 
in neuronal cells) or muscle promoter (differentiated myogenic 
cells, and primary glial cells) giving rise to differing first 
exons. A third promoter between the muscle promoter and the 
second exon of dystrophin regulates expression in cerebellar 
Purkinje neurons. Recently reviewed in (Tinsley, et al. (1994) 
Proc Natl Acad Sci U S A 91, 8307-13, Blake, et al. (1994) 
Trends in Cell Biol. 4: 19-23, Tinsley, et al. (1993) Curr Opin 
Genet Dev. 3: 484-90). 

There are various approaches which have been adopted for the 
gene therapy of DMD, using the mdx mouse as a model system. 
However, there are considerable problems related to the number 
of muscle cells that can be made dystrophin positive, the 
levels of expression of the gene and the duration of 
expression (Partridge, et al. (1995) British Medical Bulletin 
51: 123-137). It has also become apparent that simply 
re-introducing genes expressing the dystrophin carboxy-terminus 
has no effect on the dystrophic phenotype although the DGC 
appears to be re-established at the sarcolemma (Cox, et al. 
(1994) Nature Genet 8: 333-339, Greenberg, et al. (1994) Nature 
Genet 8: 340-344). 

In order to circumvent some of these problems, possibilities 
of compensating for dystrophin loss using a related protein, 
utrophin, are being explored as an alternative route to 
dystrophin gene therapy. A similar strategy is currently 
being evaluated in clinical trials to up-regulate foetal 
haemoglobin to compensate for the affected adult β-globin chains 
in patients with sickle cell anaemia (Rodgers, et al. (1993) N 
Engl J Med. 328: 73-80, Perrine, et al. (1993) N Engl J Med. 
328: 81-86). 

Utrophin is a 395 kDa protein encoded by the multiexonic 1 Mb UTRN 
gene located on chromosome 6q24 (Pearce, et al. (1993) Hum Mol 
Genet. 2: 1765-1772). At present the tissue regulation of 
utrophin is not fully understood. In the dystrophin deficient 
mdx mouse, utrophin levels in muscle remain elevated soon 
after birth compared with normal mice; once the utrophin 
levels have decreased to the adult levels (about 1 week after 
birth), the first signs of muscle fibre necrosis are detected. 
However, there is evidence to suggest that in the small calibre 
muscles, continual increased levels of utrophin can interact 
with the DGC complex (or an antigenically related complex) at 
the sarcolemma, thus preventing loss of the complex, with the 
result that these muscles appear normal. There is also a 
substantial body of evidence demonstrating that utrophin is 
capable of localising to the sarcolemma in normal muscle. 
During fetal muscle development there is increased utrophin 
expression, localised to the sarcolemma, up until 18 weeks in 
the human and 20 days gestation in the mouse. After this time 
the utrophin sarcolemmal staining steadily decreases to the 
significantly lower adult levels shortly before birth where 
utrophin is localised almost exclusively to the NMJ. The 
decrease in utrophin expression coincides with increased 
expression of dystrophin. See reviews (Ibraghimov-Beskrovnaya, 
et al. (1992) Nature 355, 696-702, Blake, et al. 
(1994) Trends in Cell Biol. 4: 19-23, Tinsley, et al. (1993) 
Curr Opin Genet Dev. 3: 484-90). 

Thus, in certain circumstances utrophin can localise to the 
sarcolemma probably at the same binding sites as dystrophin, 
through interactions with actin and the DGC. Accordingly, if 
expression of utrophin is sufficiently elevated, it may 
maintain the DGC and thus alleviate muscle degeneration in 
DMD/BMD patients (Tinsley, et al. (1993) Neuromuscul Disord 3, 
537-9). 

However, manipulation of utrophin expression and screening for 
molecules able to upregulate expression is hampered by the 
limited understanding of utrophin expression regulation and 
its promoters. We have previously isolated a promoter element 
lying within the CpG island at the 5' end of the utrophin 
locus that is active in a broad range of cell types and 
tissues, and shown it to be synaptically regulated in vivo 
(Dennis, et al. (1996) Nucleic Acids Res 24, 1646-52 and WO 
96/34101). The sequence contains a consensus N-box, a 6 bp 
motif important in the regulation of other genes expressed at 
the NMJ (Koike, et al. (1995) Proc Natl Acad Sci USA 92, 
10624-10628). Localisation of utrophin at the NMJ in mature 
muscle is partially attributable to enhanced transcription of 
utrophin at sub-junctional myonuclei, with consequent synaptic 
accumulation of mRNA (Gramolini, et al. (1997) J Biol Chem 
272, 8117-20, Vater, et al. (1998) Molecular and Cellular 
Neuroscience 10, 229-242). The utrophin promoter drives 
synaptic transcription of a reporter gene in vivo; this 
expression pattern is abolished by point mutations within the 
N-box (Gramolini, et al. (1998) J Biol Chem 273, 736-43). 

The present inventors hypothesised that utrophin might be 
transcribed from more than one promoter, an important 
consideration for the following reasons: First, it may be 
undesirable to interfere with the mechanisms underlying 
synaptic regulation of genes, as this might affect expression 
of other post-synaptic components and impair the structure and 
function of the NMJ; a promoter without synaptic regulatory 
elements might be a more suitable target for pharmacological 
manipulation. Second, cardiac dysfunction is a common feature 
of the dystrophinopathies (Hoogerwaard, et al. (1997) J Neurol 
244, 657-63, Sasaki, et al. (1998) Am Heart J 135, 937-44); if 
the cardiac utrophin message was transcribed from a different 
promoter, then it might prove necessary to up-regulate this. 
Finally, inclusion of additional regulatory sequences might 
increase the yield of a screening program to identify small 
molecules capable of transcriptional activation of utrophin. 

We have now identified an alternative promoter lying within 
the large second intron of the utrophin gene, 50kb 3' to exon 
2. The promoter is highly regulated, expressed in a wide range 
of tissues and has little similarity to the synaptically 
expressed promoter. This promoter drives transcription of a 
widely expressed unique first exon that splices into a common 
full-length mRNA at exon 3. This unique exon (called exon 1B) 
encodes a novel 31 amino acid N-terminus for the utrophin 
protein which may be involved in binding to the muscle 
membrane. The sequences of the two utrophin promoters are 
dissimilar, and we predict that they respond to discrete sets 
of cellular signals. 

Exon 1B is primarily considered herein to encode the indicated 
31 amino acids. However, the splice occurs within a codon for 
aspartate. This aspartate residue is common to both isoforms 
of utrophin. In embodiments of the invention an aspartate 
residue may be included C-terminal to the 31 amino acids to 
provide a 32 amino acid peptide, which may be joined to 
additional amino acids, for instance additional utrophin 
sequence as discussed. See, for instance, Figure 8 for one 
embodiment. 

These findings significantly contribute to the understanding 
of the molecular physiology of utrophin expression and are 
important because the promoter reported here provides an 
alternative target for transcriptional activation of utrophin 
in DMD muscle. This promoter does not contain synaptic 
regulatory elements and might, therefore, be a more suitable 
target for pharmacological manipulation than the previously 
described promoter. 

We have now cloned this alternative utrophin promoter and 
exon, and the present invention in various aspects and 
embodiments is based on the sequence information obtained and 
provided herein. 

One major use of the promoter is in screening for substances 
able to modulate its activity. It is well known that 
pharmaceutical research leading to the identification of a new 
drug generally involves the screening of very large numbers of 
candidate substances, both before and even after a lead 
compound has been found. This is one factor which makes 
pharmaceutical research very expensive and time-consuming. A 
method or means assisting in the screening process will have 
considerable commercial importance and utility. Substances 
identified as upregulators of the utrophin promoter represent 
an advance in the fight against muscular dystrophy since they 
provide a basis for design and investigation of therapeutics for 
in vivo use. 

In one aspect, the present invention provides an isolated 
nucleic acid comprising a promoter, the promoter comprising a 
sequence of nucleotides shown in Figure 1 or Figure 2. The 
promoter may comprise one or more fragments of the sequence 
shown in Figure 1 or Figure 2 sufficient to promote gene 
expression. The promoter may comprise or consist essentially 
of a sequence of nucleotides 5' to position 1440 in Figure 1 
(human) or position 1183 in Figure 2 (mouse). Preferably the 
promoter comprises or consists essentially of nucleotides 1199 
to 1440 of the human sequence shown in Figure 1, or the 
equivalent sequence in mouse, e.g. nucleotides 959 to 1183 of 
Figure 2. 

An even smaller portion of this part of the sequences shown in 
Figure 1 or Figure 2 may be used as long as promoter activity 
is retained. Restriction enzymes or nucleases may be used to 
digest the nucleic acid, followed by an appropriate assay (for 
example as illustrated herein using luciferase constructs) to 
determine the minimal sequence required. A preferred 
embodiment of the present invention provides a nucleic acid 
isolate with the minimal nucleotide sequence shown in Figure 1 
or Figure 2 required for promoter activity. The minimal 
promoter element is situated between the PvuII restriction 
site at position 1199 in the human sequence and the 
transcription start site at 1440 bp in the human sequence and 
between nucleotides 959 to 1183 in the mouse sequence (see 
Figure 2). 

In one embodiment a promoter according to the present 
invention comprises or consists of sequence that is shown in 
Figure 3 to be conserved between the human and mouse 
sequences, e.g. the 25 nucleotide sequence: 
ACAGGACATCCCAGTGTGCAGTTCG spanning the transcriptional start 
site. 

The promoter may comprise one or more sequence motifs or 
elements conferring developmental and/or tissue-specific 
regulatory control of expression. For instance, the promoter 
may comprise a sequence for muscle-specific expression, e.g. 
an E-box element/myoD binding site, such as CANNTG, preferably 
CAGGTG. 
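The E-box consensus CANNTG described above can be located in a candidate promoter sequence by a simple pattern scan. The following is a minimal sketch only: the function name `find_e_boxes` and the demonstration sequence are illustrative assumptions and are not taken from Figure 1 or Figure 2.

```python
import re

# CANNTG, where N is any nucleotide. The lookahead allows overlapping
# matches to be reported. The pattern is its own reverse complement in
# consensus form, so scanning one strand is sufficient.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (1-based start position, motif) for each CANNTG match."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in E_BOX.finditer(seq)]


# Made-up demonstration sequence (an assumption, not a utrophin sequence).
demo = "TTACAGGTGAACCATATGCC"
for position, motif in find_e_boxes(demo):
    label = "preferred CAGGTG" if motif == "CAGGTG" else "generic CANNTG"
    print(position, motif, label)
```

Positions are reported 1-based so that they can be compared directly with figure coordinates of the kind used in the text.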

Other regulatory sequences may be included, for instance as 
identified by mutation or digest assay in an appropriate 
expression system or by sequence comparison with available 
information, e.g. using a computer to search on-line 
databases. 

By "promoter" is meant a sequence of nucleotides from which 
transcription of DNA operably linked downstream (i.e. in the 
3' direction on the sense strand of double-stranded DNA) may 
be initiated. 

"Operably linked" means joined as part of the same nucleic 
acid molecule, suitably positioned and oriented for 
transcription to be initiated from the promoter. DNA operably 
linked to a promoter is "under transcriptional initiation 
regulation" of the promoter. 

The present invention extends to a promoter which has a 
nucleotide sequence which is an allele, mutant, variant or 
derivative, by way of nucleotide addition, insertion, 
substitution or deletion, of a promoter sequence as provided 
herein. Systematic or random mutagenesis of nucleic acid to 
make an alteration to the nucleotide sequence may be performed 
using any technique known to those skilled in the art. One or 
more alterations to a promoter sequence according to the 
present invention may increase or decrease promoter activity, 
or increase or decrease the magnitude of the effect of a 
substance able to modulate the promoter activity. 

"Promoter activity" is used to refer to ability to initiate 
transcription. The level of promoter activity is quantifiable 
for instance by assessment of the amount of mRNA produced by 
transcription from the promoter or by assessment of the amount 
of protein product produced by translation of mRNA produced by 
transcription from the promoter. The amount of a specific 
mRNA present in an expression system may be determined for 
example using specific oligonucleotides which are able to 
hybridise with the mRNA and which are labelled or may be used 
in a specific amplification reaction such as the polymerase 
chain reaction. Use of a reporter gene as discussed further 
below facilitates determination of promoter activity by 
reference to protein production. 

In various embodiments of the present invention a promoter 
which has a sequence that is a fragment, mutant, allele, 
derivative or variant, by way of addition, insertion, deletion 
or substitution of one or more nucleotides, of the sequence of 
either the human or the mouse promoters shown in Figures 1 and 
2, respectively, has at least about 60% homology with one or 
both of the shown sequences, preferably at least about 70% 
homology, more preferably at least about 80% homology, more 
preferably at least about 90% homology, more preferably at 
least about 95% homology. The sequence in accordance with an 
embodiment of the invention may hybridise with one or both of 
the shown sequences, or the complementary sequences (since DNA 
is generally double-stranded). 

Similarity or homology (the terms are used interchangeably) or 
identity is preferably determined using GAP, from version 20 
of GCG. This uses the algorithm of Needleman and Wunsch to 
align sequences inserting gaps as appropriate to improve the 
agreement between the two sequences. Parameters employed are 
the default ones: for nucleotide sequences - Gap Weight 50, 
Length Weight 3, Average Match 10.000, Average Mismatch 0.000; 
for peptide sequences - Gap Weight 8, Length Weight 2, Average 
Match 2.912, Average Mismatch -2.003. Peptide similarity 
scores are taken from the BLOSUM62 matrix. Also useful is the 
TBLASTN program, of Altschul et al. (1990) J. Mol. Biol. 215: 
403-10, or BestFit, which is part of the Wisconsin Package, 
Version 8, September 1994, (Genetics Computer Group, 575 
Science Drive, Madison, Wisconsin 53711, USA). 
Sequence comparisons may be made using FASTA and FASTP (see 
Pearson & Lipman, 1988. Methods in Enzymology 183: 63-98). 
Parameters are preferably set, using the default matrix, as 
follows: Gapopen (penalty for the first residue in a gap): 
-12 for proteins / -16 for DNA; Gapext (penalty for additional 
residues in a gap): -2 for proteins / -4 for DNA; KTUP word 
length: 2 for proteins / 6 for DNA. 
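By way of illustration only, a global alignment of the Needleman and Wunsch type, followed by a percent identity calculation, may be sketched as below. This is not the GAP program: it uses a single linear gap penalty rather than separate gap and length weights, and the gap value of -5 is an assumption chosen for the example. The match and mismatch scores of 10 and 0 follow the nucleotide defaults quoted above. The function names are hypothetical.

```python
def needleman_wunsch(a, b, match=10, mismatch=0, gap=-5):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


aligned = needleman_wunsch("ACGT", "ACT")
print(aligned, percent_identity(*aligned))
```

Percent identity is computed here over the full alignment length including gap columns; other conventions exist and would give different figures for the same pair of sequences.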

Nucleic acid sequence homology may be determined by means of 
selective hybridisation between molecules under stringent 
conditions. 

Preliminary experiments may be performed by hybridising under 
low stringency conditions. For probing, preferred conditions 
are those which are stringent enough for there to be a simple 
pattern with a small number of hybridisations identified as 
positive which can be investigated further. 

For example, hybridizations may be performed, according to the 
method of Sambrook et al. (below) using a hybridization 
solution comprising: 5X SSC (wherein SSC = 0.15 M sodium 
chloride; 0.15 M sodium citrate; pH 7), 5X Denhardt's reagent, 
0.5-1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm 
DNA, 0.05% sodium pyrophosphate and up to 50% formamide. 
Hybridization is carried out at 37-42°C for at least six 
hours. Following hybridization, filters are washed as 
follows: (1) 5 minutes at room temperature in 2X SSC and 1% 
SDS; (2) 15 minutes at room temperature in 2X SSC and 0.1% 
SDS; (3) 30 minutes - 1 hour at 37°C in 1X SSC and 1% SDS; (4) 
2 hours at 42-65°C in 1X SSC and 1% SDS, changing the solution 
every 30 minutes. 

One common formula for calculating the stringency conditions 
required to achieve hybridization between nucleic acid 
molecules of a specified sequence homology is (Sambrook et 
al., 1989): Tm = 81.5°C + 16.6 Log [Na+] + 0.41 (% G+C) - 0.63 
(% formamide) - 600/#bp in duplex. 

As an illustration of the above formula, using [Na+] = [0.368] 
and 50% formamide, with GC content of 42% and an average 
probe size of 200 bases, the Tm is 57°C. The Tm of a DNA 
duplex decreases by 1 - 1.5°C with every 1% decrease in 
homology. Thus, targets with greater than about 75% sequence 
identity would be observed using a hybridization temperature 
of 42°C. Such a sequence would be considered substantially 
homologous to the nucleic acid sequence of the present 
invention. 
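The worked figure of 57°C can be checked by evaluating the Sambrook formula directly. The sketch below assumes the logarithm is base 10 and that [Na+] is expressed in mol/l; the function name `melting_temp_c` is illustrative only.

```python
import math


def melting_temp_c(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(%formamide) - 600/bp."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / duplex_bp)


# Values taken from the illustration in the text.
tm = melting_temp_c(na_molar=0.368, gc_percent=42,
                    formamide_percent=50, duplex_bp=200)
print(round(tm, 1))
```

With these inputs the terms are approximately 81.5 - 7.2 + 17.2 - 31.5 - 3.0, which rounds to the 57°C stated above.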

It is well known in the art to increase stringency of 
hybridisation gradually until only a few positive clones 
remain. Other suitable conditions include, e.g. for detection 
of sequences that are about 80-90% identical, hybridization 
overnight at 42°C in 0.25M Na2HPO4, pH 7.2, 6.5% SDS, 10% 
dextran sulfate and a final wash at 55°C in 0.1X SSC, 0.1% 
SDS. For detection of sequences that are greater than about 
90% identical, suitable conditions include hybridization 
overnight at 65°C in 0.25M Na2HPO4, pH 7.2, 6.5% SDS, 10% 
dextran sulfate and a final wash at 60°C in 0.1X SSC, 0.1% 
SDS. 

In a further embodiment, hybridisation of a nucleic acid 
molecule to an allele or variant may be determined or 
identified indirectly, e.g. using a nucleic acid amplification 
reaction, particularly the polymerase chain reaction (PCR). 
PCR requires the use of two primers to specifically amplify 
target nucleic acid, so preferably two nucleic acid molecules 
with sequences characteristic of the utrophin promoter are 
employed. Using RACE PCR, only one such primer may be needed 
(see "PCR protocols; A Guide to Methods and Applications", 
Eds. Innis et al, Academic Press, New York, (1990)). 

Thus a method involving use of PCR in obtaining nucleic acid 
according to the present invention may include: 

(a) providing a preparation of nucleic acid, e.g. from a 
muscle cell; 

(b) providing a pair of nucleic acid molecule primers 
useful in (i.e. suitable for) PCR, at least one of said 
primers being a primer specific for nucleic acid according to 
the present invention; 

(c) contacting nucleic acid in said preparation with said 
primers under conditions for performance of PCR; 

(d) performing PCR and determining the presence or 
absence of an amplified PCR product. 

The presence of an amplified PCR product may indicate 
identification of an allele or other variant. The sequence 
may have the ability to promote transcription (i.e. have 
"promoter activity") in muscle cells, e.g. human muscle cells, 
or muscle-specific transcription. 

Further provided by the present invention is a nucleic acid 
construct comprising a utrophin promoter region or a fragment, 
mutant, allele, derivative or variant thereof able to promote 
transcription, operably linked to a heterologous gene, e.g. a 
coding sequence. By "heterologous" is meant a gene other than 
utrophin. Modified forms of utrophin are generally excluded. 
Generally, the gene may be transcribed into mRNA which may be 
translated into a peptide or polypeptide product which may be 
detected and preferably quantitated following expression. A 
gene whose encoded product may be assayed following expression 
is termed a "reporter gene", i.e. a gene which "reports" on 
promoter activity. 

The reporter gene preferably encodes an enzyme which catalyses 
a reaction which produces a detectable signal, preferably a 
visually detectable signal, such as a coloured product. Many 
examples are known, including β-galactosidase and luciferase. 
β-galactosidase activity may be assayed by production of blue 
colour on substrate, the assay being by eye or by use of a 
spectrophotometer to measure absorbance. Fluorescence, for 
example that produced as a result of luciferase activity, may 
be quantitated using a spectrophotometer. Radioactive assays 
may be used, for instance using chloramphenicol 
acetyltransferase, which may also be used in non-radioactive 
assays. The presence and/or amount of gene product resulting 
from expression from the reporter gene may be determined using 
a molecule able to bind the product, such as an antibody or 
fragment thereof. The binding molecule may be labelled 
directly or indirectly using any standard technique. 

Those skilled in the art are well aware of a multitude of 
possible reporter genes and assay techniques which may be used 
to determine gene activity. Any suitable reporter/assay may 
be used and it should be appreciated that no particular choice 
is essential to or a limitation of the present invention. 

Expression of a reporter gene from the promoter may be in an 
in vitro expression system or may be intracellular (in vivo). 
Expression generally requires the presence, in addition to the 
promoter which initiates transcription, of a translational 
initiation region and transcriptional and translational 
termination regions. One or more introns may be present in 
the gene, along with mRNA processing signals (e.g. splice 
sites). 

Systems for cloning and expression of a polypeptide are 
discussed further below. 

The present invention also provides a nucleic acid vector 
comprising a promoter as disclosed herein. Such a vector may 
comprise a suitably positioned restriction site or other means 
for insertion into the vector of a sequence heterologous to 
the promoter to be operably linked thereto. 

Suitable vectors can be chosen, or constructed, containing 
appropriate regulatory sequences, including promoter 
sequences, terminator fragments, polyadenylation sequences, 
enhancer sequences, marker genes and other sequences as 
appropriate. For further details see, for example, Molecular 
Cloning: a Laboratory Manual: 2nd edition, Sambrook et al, 
1989, Cold Spring Harbor Laboratory Press. Procedures for 
introducing DNA into cells depend on the host used, but are 
well known. 

Thus, a further aspect of the present invention provides a 
host cell containing a nucleic acid construct comprising a 
promoter element, as disclosed herein, operably linked to a 
heterologous gene. A still further aspect provides a method 
comprising introducing such a construct into a host cell. The 
introduction may employ any available technique, including, 
for eukaryotic cells, calcium phosphate transfection, 
DEAE-Dextran transfection, electroporation, liposome-mediated 
transfection and transduction using retrovirus. 

The introduction may be followed by causing or allowing 
expression of the heterologous gene under the control of the 
promoter, e.g. by culturing host cells under conditions for 
expression of the gene. 

In one embodiment, the construct comprising promoter and gene 
is integrated into the genome (e.g. chromosome) of the host 
cell. Integration may be promoted by inclusion in the 
construct of sequences which promote recombination with the 
genome, in accordance with standard techniques. 

Many known techniques and protocols for manipulation of 
nucleic acid, for example in preparation of nucleic acid 
constructs, mutagenesis, sequencing, introduction of DNA into 
cells and gene expression, and analysis of proteins, are 
described in detail in Current Protocols in Molecular Biology, 
Second Edition, Ausubel et al. eds., John Wiley & Sons, 1994, 
the disclosure of which is incorporated herein by reference. 

Nucleic acid molecules, constructs and vectors according to 
the present invention may be provided isolated and/or purified 
(i.e. from their natural environment), in substantially pure 
or homogeneous form, free or substantially free of a utrophin 
coding sequence, or free or substantially free of nucleic acid 
or genes of the species of interest or origin other than the 
promoter sequence. Nucleic acid according to the present 
invention may be wholly or partially synthetic. The term 
"isolate" encompasses all these possibilities. 

Nucleic acid constructs comprising a promoter (as disclosed 
herein) and a heterologous gene (reporter) may be employed in 
screening for a substance able to modulate utrophin promoter 
activity. For therapeutic purposes, e.g. for treatment of 
muscular dystrophy, a substance able to up-regulate expression 
of the promoter may be sought. A method of screening for 
ability of a substance to modulate activity of a utrophin 
promoter may comprise contacting an expression system, such as 
a host cell, containing a nucleic acid construct as herein 
disclosed with a test or candidate substance and determining 
expression of the heterologous gene. The level of 
transcription of the heterologous gene, or the level of 
heterologous protein may be determined. The level of protein 
may be determined by measuring the amount of protein, or the 
activity of the protein, using techniques known to those 
skilled in the art. 



Alternatively, or additionally, a method of screening for 
ability of a substance to modulate activity of a utrophin 
promoter may comprise contacting a cell containing an 
endogenous utrophin gene (e.g. a mammalian muscle cell) with a 
test substance and measuring the level of RNA transcription or 
protein expression using binding members specific for the 
nucleic acid or polypeptides disclosed herein. Specific 
binding members include antibodies and nucleic acid probes. 

The level of expression in the presence of the test substance 
may be compared with the level of expression in the absence of 
the test substance. A difference in expression in the 
presence of the test substance indicates ability of the 
substance to modulate gene expression. An increase in 
expression of the heterologous gene compared with expression 
of another gene not linked to a promoter as disclosed herein 
indicates specificity of the substance for modulation of the 
utrophin promoter. 
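The comparison described above may be expressed as a fold change for the reporter and for a control gene not linked to the promoter. The following is a hedged sketch: the 2-fold threshold, the function names and the example readings are all assumptions made for illustration and do not come from the experiments described herein.

```python
def fold_change(with_substance, without_substance):
    """Ratio of expression in the presence vs absence of the test substance."""
    if without_substance <= 0:
        raise ValueError("baseline expression must be positive")
    return with_substance / without_substance


def assess_substance(reporter_with, reporter_without,
                     control_with, control_without, threshold=2.0):
    """Return (reporter fold change, control fold change, modulates, specific).

    'modulates' is True when the reporter changes by at least `threshold`
    in either direction; 'specific' additionally requires that the control
    gene stays within that threshold. The threshold is an assumed cut-off.
    """
    reporter_fc = fold_change(reporter_with, reporter_without)
    control_fc = fold_change(control_with, control_without)
    modulates = reporter_fc >= threshold or reporter_fc <= 1.0 / threshold
    specific = modulates and (1.0 / threshold < control_fc < threshold)
    return reporter_fc, control_fc, modulates, specific


# Hypothetical reporter readings in arbitrary units.
print(assess_substance(3000.0, 1000.0, 1050.0, 1000.0))
```

In practice a cut-off would be chosen from replicate measurements and the variability of the particular assay rather than fixed in advance.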

A promoter construct may be transfected into a cell line using 
any technique previously described to produce a stable cell 
line containing the reporter construct integrated into the 
genome. The cells may be grown and incubated with test 
compounds for varying times. The cells may be grown in 96 
well plates to facilitate the analysis of large numbers of 
compounds. The cells may then be washed and the reporter gene 
expression analysed. For some reporters, such as luciferase, 
the cells will be lysed then analysed. Previous experiments 
testing the effects of glucocorticoids on the endogenous 
utrophin protein and RNA levels in myoblasts have already been 
described [12,13] and techniques used for those experiments 
may similarly be employed. 

Constructs comprising one or more developmental and/or 
time-specific regulatory motifs (as discussed) may be used to 
screen for a substance able to modulate the corresponding 
aspect of the promoter activity, e.g. muscle-specific 
expression. 

Following identification of a substance which modulates or 
affects utrophin promoter activity, the substance may be 
investigated further. Furthermore, it may be manufactured 
and/or used in preparation, i.e. manufacture or formulation, 
of a composition such as a medicament, pharmaceutical 
composition or drug. These may be administered to 
individuals. 

As noted above, the inventors also identified a novel coding 
sequence (Exon 1B) which encodes a novel utrophin N-terminus. 

According to a further aspect of the present invention there 
is provided a nucleic acid molecule which has a nucleotide 
sequence encoding a polypeptide which includes the amino acid 
sequence shown in Figure 1 or Figure 2. 

Such a polypeptide may include other utrophin sequences, and 
the nucleic acid molecule may be in the form of a utrophin 
"mini-gene" (discussed further below). 

Such a polypeptide may include non-utrophin (i.e. heterologous 
or foreign) sequences and thereby form a larger fusion 
protein. For example, such a fusion protein could be used to 
target a non-utrophin polypeptide to muscle membranes. 

The coding sequence included may be that shown in Figure 1 or 
Figure 2 or it may be a mutant, variant, derivative or allele 
of the sequence shown. The sequence may differ from that 
shown by a change which is one or more of addition, insertion, 
deletion and substitution of one or more nucleotides of the 
sequence shown. Changes to a nucleotide sequence may result 
in an amino acid change at the protein level, or not, as 
determined by the genetic code. 

WO 01/25461 PCT/GB00/03800 

Thus, nucleic acid according to the present invention may 
include a sequence different from the sequences shown in 
Figure 1 or Figure 2 yet encode a polypeptide with the same 
amino acid sequence. The amino acid sequences shown in Figure 
1 and Figure 2 consist of 31 residues. 

On the other hand the encoded polypeptide may comprise an 
amino acid sequence which differs by one or more amino acid 
residues from the amino acid sequences shown in Figure 1 or 
Figure 2. Nucleic acid encoding a polypeptide which is an
amino acid sequence mutant, variant, derivative or allele of
the sequences shown in Figure 1 and Figure 2 is further
provided by the present invention. Nucleic acid encoding
such a polypeptide may show at the nucleotide sequence and/or
encoded amino acid level greater than about 60% homology with
the coding sequence and/or the amino acid sequence shown in
Figure 1 or Figure 2, greater than about 70% homology, greater
than about 80% homology, greater than about 90% homology or
greater than about 95% homology. Determination of homology is
discussed elsewhere herein.
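
By way of illustration only, the percentage thresholds above
can be checked against two sequences that have already been
aligned. The sketch below assumes simple percent identity over
aligned columns as a stand-in for the homology measure discussed
elsewhere herein, which may differ; the function names and the
toy sequences are hypothetical.

```python
THRESHOLDS = (60, 70, 80, 90, 95)


def percent_identity(aligned_a, aligned_b):
    """Percent identical columns in a pairwise alignment ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_exceeded(identity):
    """Return the thresholds (in percent) that the identity is greater than."""
    return [t for t in THRESHOLDS if identity > t]


identity = percent_identity("ACGT", "ACGA")  # toy alignment
print(identity, thresholds_exceeded(identity))
```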

A polypeptide which is a variant, allele, derivative or mutant
may have an amino acid sequence which differs from that given
in a figure herein by one or more of addition, substitution,
deletion and insertion of one or more amino acids. Preferred
such polypeptides have wild-type function, that is to say have
one or more of the following properties: immunological
cross-reactivity with an antibody reactive with the polypeptide
for which the sequence is given in Figure 1 or Figure 2;
sharing an epitope with the polypeptide for which the amino
acid sequence is shown in Figure 1 or Figure 2 (as determined
for example by immunological cross-reactivity between the two
polypeptides); a biological activity which is inhibited by an
antibody raised against the polypeptide whose sequence is shown
in Figure 1 or Figure 2; ability to bind muscle membrane;
ability to bind actin; ability to bind DPC.

Variations in amino acid sequence include "conservative
variation", i.e. substitution of one hydrophobic residue such
as isoleucine, valine, leucine or methionine for another, or
the substitution of one polar residue for another, such as
arginine for lysine, glutamic for aspartic acid, or glutamine
for asparagine. Particular amino acid sequence variants may
differ from that shown in Figure 1 or Figure 2 by insertion,
addition, substitution or deletion of 1 amino acid, 2, 3, 4,
or 5-10 amino acids.
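
The notion of conservative variation can be sketched in code.
The example below encodes only the residue groupings named in
the preceding paragraph (it is not a complete substitution
matrix), compares two equal-length toy peptides, and uses
hypothetical function names.

```python
# Groupings taken from the text: hydrophobic I, V, L, M and the polar
# pairs arginine/lysine, glutamic/aspartic acid, glutamine/asparagine.
HYDROPHOBIC = frozenset("IVLM")
POLAR_PAIRS = (frozenset("RK"), frozenset("ED"), frozenset("QN"))


def is_conservative(a, b):
    """True if replacing residue a with residue b is conservative."""
    if a == b:
        return False  # no substitution at all
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return True
    return any(a in pair and b in pair for pair in POLAR_PAIRS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, conservative) for each differing residue.

    Positions are 1-based; sequences must have equal length, so
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i, r, v, is_conservative(r, v))
            for i, (r, v) in enumerate(zip(reference, variant), start=1)
            if r != v]


print(classify_substitutions("MIKE", "MVKD"))  # toy peptides
```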

According to one aspect of the present invention there is 
provided a nucleic acid molecule comprising a sequence of 
nucleotides encoding a polypeptide with utrophin function. 
Utrophin nucleotide sequences which may be included in the
nucleic acid molecule are disclosed in WO 97/22696 which is
incorporated herein by reference.

See also Figure 8 and Figure 9 for disclosure of nucleic acid 
molecules and polypeptides according to the present invention, 
comprising the exon 1B sequence of the invention.

A polypeptide with utrophin function is able to bind actin and 
able to bind the dystrophin protein complex (DPC).

The nucleic acid molecule may be an isolate, or in an isolated 
and/or purified form, that is to say not in an environment in 
which it is found in nature, removed from its natural 
environment. It may be free from other nucleic acid 
obtainable from the same species, e.g. encoding another 
polypeptide. 

In one embodiment, the nucleic acid molecule is a "mini-gene",
i.e. the polypeptide encoded does not correspond to
full-length utrophin but is rather shorter, a truncated version
(utrophin mini-genes are discussed in WO 97/22696). For
instance, part or all of the rod domain may be missing, such
that the polypeptide comprises an actin-binding domain and a
DPC-binding domain but is shorter than naturally occurring
utrophin. In a full-length utrophin gene including what are
identified herein as exons 1A and 1B, the actin-binding domain
is encoded by nucleotides 1-739, while the DPC-binding domain
(CRCT) is encoded by nucleotides 8499-10301 (where 1
represents the start of translation). See also Figure 8. The
respective domains in the polypeptide encoded by a mini-gene
according to the invention may comprise amino acids
corresponding to those encoded by these nucleotides in the
full-length coding sequence. In one embodiment, a minigene
according to the present invention comprises or consists of
the amino acid sequence encoded by nucleotides 1-739 and
8499-10301 of the A isoform of utrophin in which exon 1B as
identified herein is substituted for exons 1A and 2A. The
sequence of such a minigene can be constructed by the ordinary
skilled person using information disclosed herein, taking into
account the content of WO 97/22696 and Tinsley et al, Nature
(1996) 384:349. The nucleic acid sequence and predicted
amino acid sequence encoded by a 'mini-gene' according to the
present invention are shown in Figure 9.
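
The coordinate arithmetic behind joining the two nucleotide
ranges quoted above can be sketched as follows. This is an
illustration only: it uses a placeholder sequence rather than
any real utrophin sequence, the function names are hypothetical,
and it does not check the reading frame at the junction or
reproduce the sequence of Figure 9.

```python
# Ranges quoted in the text, 1-based and inclusive, numbered from the
# start of translation.
ACTIN_BINDING = (1, 739)
DPC_BINDING = (8499, 10301)


def segment(seq, start, end):
    """Return seq[start..end] using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]


def join_segments(seq, ranges):
    """Concatenate the given 1-based inclusive ranges in order."""
    return "".join(segment(seq, start, end) for start, end in ranges)


# Placeholder standing in for a full-length coding sequence (assumption).
placeholder = "ACGT" * 2600  # 10400 nucleotides
joined = join_segments(placeholder, [ACTIN_BINDING, DPC_BINDING])
print(len(joined))
```

A real construct would additionally need the junction to
preserve the reading frame, which this sketch leaves to the
skilled person.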

Advantages of a mini-gene over a sequence encoding a
full-length utrophin molecule or derivative thereof include
easier manipulation and inclusion in vectors, such as
adenoviral and retroviral vectors for delivery and expression.

A further preferred non-naturally occurring nucleic acid
molecule encoding a polypeptide with the specified
characteristics is a chimaeric construct wherein the encoding
sequence comprises a sequence obtainable from one mammal,
preferably human ("a human sequence"), and a sequence
obtainable from another mammal, preferably mouse ("a mouse
sequence"). Such a chimaeric construct may of course comprise
the addition, insertion, substitution and/or deletion of one
or more nucleotides with respect to the parent mammalian
sequences from which it is derived. Preferably, the part of
the coding sequence which encodes the actin-binding domain
comprises a sequence of nucleotides obtainable from the mouse,
or other non-human mammal, or a sequence of nucleotides
derived from a sequence obtainable from the mouse, or other
non-human mammal.

In a preferred embodiment, the sequence of nucleotides
encoding the polypeptide comprises sequence GAGGCAC at
residues 331-337 and/or the sequence GATTGTGGATGAAAACAGTGGG at
residues 1453-1475 (using the conventional numbering from the
initiation codon ATG), and a sequence obtainable from a human.

Nucleic acid according to the present invention is obtainable
using one or more oligonucleotide probes or primers designed
to hybridise with one or more fragments of a nucleic acid
sequence shown in Figure 1 or Figure 2, particularly fragments
of relatively rare sequence, based on codon usage or
statistical analysis. The amino acid sequence information
provided may be used in design of degenerate probes/primers or
"long" probes. A primer designed to hybridise with a fragment
of the nucleic acid sequence shown may be used in conjunction
with one or more oligonucleotides designed to hybridise to a
sequence in a cloning vector within which target nucleic acid
has been cloned, or in so-called "RACE" (rapid amplification
of cDNA ends) in which cDNAs in a library are ligated to an
oligonucleotide linker and PCR is performed using a primer
which hybridises with the sequence shown in the figures and a
primer which hybridises to the oligonucleotide linker.
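
When amino acid sequence information is used to design
degenerate probes or primers, one common consideration is how
many distinct nucleotide sequences could encode a given stretch
of residues. The sketch below counts this under the standard
genetic code and picks the least degenerate window of a toy
peptide; it is an assumption-laden illustration with
hypothetical function names, not a primer design procedure.

```python
from collections import Counter
from math import prod

# Amino acids of the standard genetic code in TCAG codon order; counting
# letters gives the number of codons per residue (e.g. M: 1, L: 6).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS)


def degeneracy(peptide):
    """Number of distinct nucleotide sequences encoding the peptide."""
    peptide = peptide.upper()
    for aa in peptide:
        if aa == "*" or aa not in CODONS_PER_RESIDUE:
            raise ValueError(f"unsupported residue: {aa!r}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


def least_degenerate_window(protein, width):
    """Return (1-based start, window, degeneracy) with the lowest degeneracy."""
    if not 1 <= width <= len(protein):
        raise ValueError("width must be between 1 and the protein length")
    windows = [(i + 1, protein[i:i + width])
               for i in range(len(protein) - width + 1)]
    start, window = min(windows, key=lambda w: degeneracy(w[1]))
    return start, window, degeneracy(window)


print(least_degenerate_window("MLSWMK", 2))  # toy peptide
```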

Nucleic acid isolated and/or purified from one or more cells
(e.g. human, mouse) or a nucleic acid library derived from
nucleic acid isolated and/or purified from cells (e.g. a cDNA
library derived from mRNA isolated from the cells), may be
probed under conditions for selective hybridisation and/or
subjected to a specific nucleic acid amplification reaction
such as the polymerase chain reaction (PCR).

A method may include hybridisation of one or more (e.g. two)
probes or primers to target nucleic acid. Where the nucleic
acid is double-stranded DNA, hybridisation will generally be
preceded by denaturation to produce single-stranded DNA. The
hybridisation may be as part of a PCR procedure, or as part of
a probing procedure not involving PCR. An example procedure
would be a combination of PCR and low stringency
hybridisation. A screening procedure, chosen from the many
available to those skilled in the art, is used to identify
successful hybridisation events and isolated hybridised
nucleic acid.

Probing may employ the standard Southern blotting technique.
For instance DNA may be extracted from cells and digested with
different restriction enzymes. Restriction fragments may then
be separated by electrophoresis on an agarose gel, before
denaturation and transfer to a nitrocellulose filter.
Labelled probe may be hybridised to the DNA fragments on the
filter and binding determined. DNA for probing may be
prepared from RNA preparations from cells.

Preliminary experiments may be performed by hybridising under 
low stringency conditions various probes to Southern blots of 
DNA digested with restriction enzymes. Suitable conditions
would be achieved when a large number of hybridising fragments
were obtained while the background hybridisation was low.
Using these conditions nucleic acid libraries, e.g. cDNA 
libraries representative of expressed sequences, may be 
searched. 

It may be necessary for one or more gene fragments to be 
ligated to generate a full-length coding sequence. Also,
where a full-length encoding nucleic acid molecule has not
been obtained, a smaller molecule representing part of the
full molecule may be used to obtain full-length clones.
Inserts may be prepared from partial cDNA clones and used to 
screen cDNA libraries. 

Those skilled in the art are well able to employ suitable
conditions of the desired stringency for selective
hybridisation, taking into account factors such as
oligonucleotide length and base composition, temperature and
so on. Exemplary conditions have been discussed already
above.
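
As a rough illustration of how oligonucleotide length and base
composition bear on hybridisation temperature, the sketch below
applies the widely used Wallace rule of thumb for short
oligonucleotides, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This
rule is an assumption introduced here for illustration; it is
not a condition taken from this specification, and the function
names are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate melting temperature (deg C) of a short oligonucleotide.

    Uses the Wallace rule: 2 deg C per A or T, 4 deg C per G or C.
    It is only a rough guide and is usually applied to short oligos.
    """
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def gc_fraction(oligo):
    """Fraction of G and C bases in the oligonucleotide."""
    oligo = oligo.upper()
    if not oligo:
        raise ValueError("oligo must not be empty")
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


# Two toy oligos of equal length but different base composition.
for toy in ("ATATATAT", "GCGCGCGC"):
    print(toy, gc_fraction(toy), wallace_tm(toy))
```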

Nucleic acid according to the present invention may form part
of a cloning vector and/or a vector from which the encoded
polypeptide may be expressed. Polypeptide expression is
discussed below. Suitable vectors can be chosen or
constructed, containing appropriate and appropriately
positioned regulatory sequences, as discussed elsewhere
herein.

A further aspect of the present invention provides a 
polypeptide which comprises the amino acid sequence shown in 
Figure 1 or Figure 2. As mentioned earlier such a polypeptide 
may include other utrophin sequences or may include
heterologous sequences. 

Polypeptides which are amino acid sequence variants, alleles, 
derivatives or mutants are also provided by the present 
invention. Such polypeptides are discussed elsewhere herein. 

The skilled person can use the techniques described herein and
others well known in the art to produce large amounts of 
peptides, for instance by expression from encoding nucleic 
acid. 

In a further aspect the invention provides a method of making
a polypeptide, the method including expression from nucleic
acid encoding the polypeptide (generally nucleic acid
according to the invention). This may conveniently be
achieved by growing in culture a host cell containing such a
vector, under suitable conditions which cause or allow
expression of the polypeptide. Polypeptides may also be
expressed in in vitro systems such as reticulocyte lysate.

Systems for cloning and expression of a polypeptide in a 
variety of different host cells are well known. Suitable host 
cells include bacteria, mammalian cells, yeast and baculovirus 
systems. Mammalian cell lines available in the art for
expression of a heterologous polypeptide include Chinese 
hamster ovary cells, HeLa cells, baby hamster kidney cells and 
many others. A common, preferred bacterial host is E. coli. 

Thus, a further aspect of the present invention provides a 
host cell containing heterologous nucleic acid encoding a
polypeptide as disclosed herein. 

The nucleic acid may be integrated into the genome (e.g. 
chromosome) of the host cell or may be on an extra-chromosomal
vector within the cell, or otherwise identifiably heterologous
or foreign to the cell.

A still further aspect provides a method comprising 
introducing such nucleic acid into a host cell. Suitable 
techniques are discussed elsewhere herein. 

The introduction may be followed by causing or allowing
expression from the nucleic acid, e.g. by culturing host cells
under conditions for expression of the gene.

The polypeptide encoded by the nucleic acid may be expressed
from the nucleic acid in vitro, e.g. in a cell-free system or
in cultured cells, or in vivo.

If the polypeptide is expressed coupled to an appropriate
signal leader peptide it may be secreted from the cell into
the culture medium.

Peptides can also be generated wholly or partly by chemical
synthesis. The compounds of the present invention can be
readily prepared according to well-established, standard
liquid or, preferably, solid-phase peptide synthesis methods,
general descriptions of which are broadly available (see, for
example, in J.M. Stewart and J.D. Young, Solid Phase Peptide
Synthesis, 2nd edition, Pierce Chemical Company, Rockford,
Illinois (1984), in M. Bodanszky and A. Bodanszky, The
Practice of Peptide Synthesis, Springer Verlag, New York
(1984); and Applied Biosystems 430A Users Manual, ABI Inc.,
Foster City, California), or they may be prepared in solution,
by the liquid phase method or by any combination of
solid-phase, liquid phase and solution chemistry, e.g. by first
completing the respective peptide portion and then, if desired
and appropriate, after removal of any protecting groups being
present, by introduction of the residue X by reaction of the
respective carbonic or sulfonic acid or a reactive derivative
thereof.

The present invention also includes active portions, 
fragments, derivatives and functional mimetics of the 
polypeptides of the invention. An "active portion" of a 
polypeptide means a peptide which is less than said full 
length polypeptide, but which retains a biological activity, 
such as a biological activity selected from binding to ligand, 
binding to muscle membrane. Such an active fragment may be 
included as part of a fusion protein, e.g. including a 
polypeptide which is to be targeted to the muscle membrane.

A "fragment" of a polypeptide generally means a stretch of
amino acid residues of about five to twenty-five contiguous
amino acids, typically about ten to twenty contiguous amino
acids. Fragments of the novel N-terminus polypeptide sequence
may include antigenic determinants or epitopes useful for
raising antibodies to a portion of the amino acid sequence, or
may be sequence useful for targeting to muscle membrane.
Alanine scans are commonly used to find and refine peptide
motifs within polypeptides, this involving the systematic
replacement of each residue in turn with the amino acid
alanine, followed by an assessment of biological activity.
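
The sequence-generation step of an alanine scan can be sketched
as follows. The example produces one variant per non-alanine
position of a toy peptide; the subsequent assessment of
biological activity is experimental and is not modelled here.
The function name is hypothetical.

```python
def alanine_scan(peptide):
    """Yield (position, original residue, variant) for each position.

    Positions are 1-based. Residues that are already alanine are
    skipped, since replacing them would leave the peptide unchanged.
    """
    peptide = peptide.upper()
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue
        yield i + 1, residue, peptide[:i] + "A" + peptide[i + 1:]


for position, original, variant in alanine_scan("MAKE"):  # toy peptide
    print(position, original, variant)
```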

Preferred fragments of exon 1B polypeptide include those
comprising or consisting of an epitope which may be used for
instance in raising or isolating antibodies. Variant and
derivative peptides, peptides which have an amino acid 
sequence which differs from one of these sequences by way of 
addition, insertion, deletion or substitution of one or more 
amino acids are also provided by the present invention. 

A "derivative" of a polypeptide or a fragment thereof may 
include a polypeptide modified by varying the amino acid 
sequence of the protein, e.g. by manipulation of the nucleic 
acid encoding the protein or by altering the protein itself. 
Such derivatives of the natural amino acid sequence may 
involve one or more of insertion, addition, deletion or 
substitution of one or more amino acids, which may be without 
fundamentally altering the qualitative nature of biological 
activity of the wild type polypeptide. Also encompassed 
within the scope of the present invention are functional 
mimetics of active fragments of the exon 1B polypeptides
provided (including alleles, mutants, derivatives and
variants). The term "functional mimetic" means a substance
which may not contain an active portion of the relevant amino
acid sequence, and probably is not a peptide at all, but which
retains in qualitative terms biological activity of natural
exon 1B polypeptide. The design and screening of candidate
mimetics is described in detail below.

A polypeptide according to the present invention may be 
isolated and/or purified (e.g. using an antibody) for instance
after production by expression from encoding nucleic acid (for
which see below). Thus, a polypeptide may be provided free or
substantially free from contaminants with which it is
naturally associated (if it is a naturally-occurring
polypeptide). A polypeptide may be provided free or
substantially free of other polypeptides. Polypeptides
according to the present invention may be generated wholly or
partly by chemical synthesis. The isolated and/or purified
polypeptide may be used in formulation of a composition, which
may include at least one additional component, for example a
pharmaceutical composition including a pharmaceutically
acceptable excipient, vehicle or carrier. A composition
including a polypeptide according to the invention may be used
in prophylactic and/or therapeutic treatment as discussed
below.

A polypeptide, peptide, allele, mutant, derivative or variant 
according to the present invention may be used as an immunogen 
or otherwise in obtaining specific antibodies. Antibodies are 
useful in purification and other manipulation of polypeptides 
and peptides, diagnostic screening and therapeutic contexts.

Accordingly, a further aspect of the present invention
provides an antibody able to bind specifically to the
polypeptide whose sequence is given in Figure 1 or Figure 2.
Such an antibody may be specific in the sense of being able to
distinguish between the polypeptide it is able to bind and
other human (or mouse) polypeptides for which it has no or
substantially no binding affinity (e.g. a binding affinity of
about 1000x less). Specific antibodies bind an epitope on the
molecule which is either not present or is not accessible on
other molecules. Antibodies according to the present
invention may be specific for the wild-type polypeptide.
Antibodies according to the invention may be specific for a
particular mutant, variant, allele or derivative polypeptide
as between that molecule and the wild-type polypeptide, so as
to be useful in diagnostic and prognostic methods as discussed
below. Antibodies are also useful in purifying the
polypeptide or polypeptides to which they bind, e.g. following
production by recombinant expression from encoding nucleic
acid.

Preferred antibodies according to the invention are isolated, 
in the sense of being free from contaminants such as 
antibodies able to bind other polypeptides and/or free of 
serum components. Monoclonal antibodies are preferred for 
some purposes, though polyclonal antibodies are within the 
scope of the present invention. 

Antibodies may be obtained using techniques which are standard 
in the art. Methods of producing antibodies include 
immunising a mammal (e.g. mouse, rat, rabbit, horse, goat, 
sheep or monkey) with the protein or a fragment thereof. 
Antibodies may be obtained from immunised animals using any of 
a variety of techniques known in the art, and screened, 
preferably using binding of antibody to antigen of interest. 
For instance, Western blotting techniques or
immunoprecipitation may be used (Armitage et al., 1992,
Nature 357: 80-82). Isolation of antibodies and/or
antibody-producing cells from an animal may be accompanied by
a step of sacrificing the animal.

As an alternative or supplement to immunising a mammal with a
peptide, an antibody specific for a protein may be obtained
from a recombinantly produced library of expressed
immunoglobulin variable domains, e.g. using lambda
bacteriophage or filamentous bacteriophage which display
functional immunoglobulin binding domains on their surfaces;
for instance see WO92/01047. The library may be naive, that
is, constructed from sequences obtained from an organism which
has not been immunised with any of the proteins (or
fragments), or may be one constructed using sequences obtained
from an organism which has been exposed to the antigen of
interest.

Antibodies according to the present invention may be modified 
in a number of ways. Indeed the term "antibody" should be 
construed as covering any binding substance having a binding 
domain with the required specificity. Thus the invention 
covers antibody fragments, derivatives, functional equivalents 
and homologues of antibodies, including synthetic molecules 
and molecules whose shape mimics that of an antibody enabling
it to bind an antigen or epitope. 

Example antibody fragments, capable of binding an antigen or
other binding partner are the Fab fragment consisting of the
VL, VH, CL and CH1 domains; the Fd fragment consisting of the
VH and CH1 domains; the Fv fragment consisting of the VL and
VH domains of a single arm of an antibody; the dAb fragment
which consists of a VH domain; isolated CDR regions and
F(ab')2 fragments, a bivalent fragment including two Fab
fragments linked by a disulphide bridge at the hinge region.
Single chain Fv fragments are also included.

A hybridoma producing a monoclonal antibody according to the
present invention may be subject to genetic mutation or other
changes. It will further be understood by those skilled in
the art that a monoclonal antibody can be subjected to the
techniques of recombinant DNA technology to produce other
antibodies or chimeric molecules which retain the specificity
of the original antibody. Such techniques may involve
introducing DNA encoding the immunoglobulin variable region,
or the complementarity determining regions (CDRs), of an
antibody to the constant regions, or constant regions plus
framework regions, of a different immunoglobulin. See, for
instance, EP184187A, GB 2188638A or EP-A-0239400. Cloning and
expression of chimeric antibodies are described in
EP-A-0120694 and EP-A-0125023.

Hybridomas capable of producing antibody with desired binding
characteristics are within the scope of the present
invention, as are host cells, eukaryotic or prokaryotic,
containing nucleic acid encoding antibodies (including
antibody fragments) and capable of their expression. The
invention also provides methods of production of the
antibodies including growing a cell capable of producing the
antibody under conditions in which the antibody is produced,
and preferably secreted.

The reactivities of antibodies on a sample may be determined
by any appropriate means. Tagging with individual reporter
molecules is one possibility. The reporter molecules may
directly or indirectly generate detectable, and preferably
measurable, signals. The linkage of reporter molecules may be
directly or indirectly, covalently, e.g. via a peptide bond or
non-covalently. Linkage via a peptide bond may be as a result
of recombinant expression of a gene fusion encoding antibody
and reporter molecule.

One favoured mode is by covalent linkage of each antibody with
an individual fluorochrome, phosphor or laser dye with
spectrally isolated absorption or emission characteristics.
Suitable fluorochromes include fluorescein, rhodamine,
phycoerythrin and Texas Red. Suitable chromogenic dyes
include diaminobenzidine.

Other reporters include macromolecular colloidal particles or
particulate material such as latex beads that are coloured,
magnetic or paramagnetic, and biologically or chemically
active agents that can directly or indirectly cause detectable
signals to be visually observed, electronically detected or
otherwise recorded. These molecules may be enzymes which
catalyse reactions that develop or change colours or cause
changes in electrical properties, for example. They may be
molecularly excitable, such that electronic transitions
between energy states result in characteristic spectral
absorptions or emissions. They may include chemical entities
used in conjunction with biosensors. Biotin/avidin or
biotin/streptavidin and alkaline phosphatase detection systems
may be employed.

The mode of determining binding is not a feature of the
present invention and those skilled in the art are able to
choose a suitable mode according to their preference and
general knowledge. Particular embodiments of antibodies
according to the present invention include antibodies able to
bind and/or which bind specifically, e.g. with an affinity of
at least 10^-7 M, to the peptides shown in Figure 1 or Figure 2.

Antibodies according to the present invention may be used in 
screening for the presence of a polypeptide, for example in a 
test sample containing cells or cell lysate as discussed, and
may be used in purifying and/or isolating a polypeptide 
according to the present invention, for instance following 
production of the polypeptide by expression from encoding 
nucleic acid therefor. 

An antibody may be provided in a kit, which may include
instructions for use of the antibody, e.g. in determining the
presence of a particular substance in a test sample. One or
more other reagents may be included, such as labelling
molecules, buffer solutions, elutants and so on. Reagents may
be provided within containers which protect them from the
external environment, such as a sealed vial.

The present invention extends in various aspects not only to a
substance identified using a nucleic acid molecule as a
modulator of utrophin promoter activity, or to a polypeptide,
or nucleic acid molecule in accordance with what is disclosed
herein, but also a pharmaceutical composition, medicament,
drug or other composition comprising such a substance, a
method comprising administration of such a composition to a
patient, e.g. for increasing utrophin expression for instance
in treatment of muscular dystrophy, use of such a substance in
manufacture of a composition for administration, e.g. for
increasing utrophin expression for instance in treatment of
muscular dystrophy, and a method of making a pharmaceutical
composition comprising admixing such a substance with a
pharmaceutically acceptable excipient, vehicle or carrier, and
optionally other ingredients.

Administration will preferably be in a "therapeutically
effective amount", this being sufficient to show benefit to a
patient. Such benefit may be at least amelioration of at
least one symptom. The actual amount administered, and rate
and time-course of administration, will depend on the nature
and severity of what is being treated. Prescription of
treatment, e.g. decisions on dosage etc, is within the
responsibility of general practitioners and other medical
doctors.

A composition may be administered alone or in combination with
other treatments, either simultaneously or sequentially
dependent upon the condition to be treated.

Pharmaceutical compositions according to the present
invention, and for use in accordance with the present
invention, may comprise, in addition to active ingredient, a
pharmaceutically acceptable excipient, carrier, buffer,
stabiliser or other materials well known to those skilled in
the art. Such materials should be non-toxic and should not
interfere with the efficacy of the active ingredient. The
precise nature of the carrier or other material will depend on
the route of administration, which may be oral, or by
injection, e.g. cutaneous, subcutaneous or intravenous.

Pharmaceutical compositions for oral administration may be in
tablet, capsule, powder or liquid form. A tablet may comprise
a solid carrier such as gelatin or an adjuvant. Liquid 
pharmaceutical compositions generally comprise a liquid 
carrier such as water, petroleum, animal or vegetable oils, 
mineral oil or synthetic oil. Physiological saline solution, 
dextrose or other saccharide solution or glycols such as 
ethylene glycol, propylene glycol or polyethylene glycol may 
be included. 

For intravenous, cutaneous or subcutaneous injection, or 
injection at the site of affliction, the active ingredient 
will be in the form of a parenterally acceptable aqueous 
solution which is pyrogen-free and has suitable pH,
isotonicity and stability. Those of relevant skill in the art 
are well able to prepare suitable solutions using, for 
example, isotonic vehicles such as Sodium Chloride Injection, 
Ringer's Injection, Lactated Ringer's Injection. 
Preservatives, stabilisers, buffers, antioxidants and/or other 
additives may be included, as required. 

Instead of a substance identified using a promoter as
disclosed herein, a mimetic or mimic of the substance may be
designed for pharmaceutical use. The designing of mimetics to
a known pharmaceutically active compound is a known approach
to the development of pharmaceuticals based on a "lead"
compound. This might be desirable where the active compound
is difficult or expensive to synthesise or where it is
unsuitable for a particular method of administration, e.g.
peptides are unsuitable active agents for oral compositions as
they tend to be quickly degraded by proteases in the
alimentary canal. Mimetic design, synthesis and testing may
be used to avoid randomly screening large numbers of molecules
for a target property.

There are several steps commonly taken in the design of a
mimetic from a compound having a given target property.
Firstly, the particular parts of the compound that are
critical and/or important in determining the target property
are determined. In the case of a peptide, this can be done by
systematically varying the amino acid residues in the peptide,
e.g. by substituting each residue in turn. These parts or
residues constituting the active region of the compound are
known as its "pharmacophore".

Once the pharmacophore has been found, its structure is
modelled according to its physical properties, e.g.
stereochemistry, bonding, size and/or charge, using data from
a range of sources, e.g. spectroscopic techniques, X-ray
diffraction data and NMR. Computational analysis, similarity
mapping (which models the charge and/or volume of a
pharmacophore, rather than the bonding between atoms) and
other techniques can be used in this modelling process.

In a variant of this approach, the three-dimensional structure
of the ligand and its binding partner are modelled. This can
be especially useful where the ligand and/or binding partner
change conformation on binding, allowing the model to take
account of this in the design of the mimetic.

A template molecule is then selected onto which chemical
groups which mimic the pharmacophore can be grafted. The
template molecule and the chemical groups grafted on to it can
conveniently be selected so that the mimetic is easy to
synthesise, is likely to be pharmacologically acceptable, and
does not degrade in vivo, while retaining the biological
activity of the lead compound. The mimetic or mimetics found
by this approach can then be screened to see whether they have
the target property, or to what extent they exhibit it.
Further optimisation or modification can then be carried out
to arrive at one or more final mimetics for in vivo or
clinical testing.

Mimetics of substances identified as having ability to 
modulate utrophin promoter activity using a screening method 
as disclosed herein are included within the scope of the 
present invention. 

Modifications to and further aspects and embodiments of the 
present invention will be apparent to those skilled in the 
art. All documents mentioned herein are incorporated by 
reference. 

Experimental basis for and embodiments of the present 
invention will now be described in more detail, by way of 
example and not limitation, and with reference to the 
following figures: 

Figure 1 shows the sequence of the human exon 1B and promoter 
B. Numbering corresponds to the insert of pBSX2.0. The deduced 
translation of exon 1B is shown. The positions of features 
such as restriction sites, IL-6 response element and Alu 
repetitive elements are shown. 

Figure 2 shows the sequence of the mouse exon 1B and promoter 
B. Numbering corresponds to the insert of pBSX8.0. The deduced 
translation of exon 1B is shown. The positions of features 
such as restriction sites, IL-6 response element and Alu 
repetitive elements are shown. 

Figure 3 shows the sequence alignment of human (top) and mouse 
(bottom) exon 1B (in upper case) and promoter B. Numbering 
corresponds to the inserts of pBSX2.0 and pBSX8.0, 
respectively. The human PvuII site (see Figure 7) is 
indicated. The open triangle indicates the position at which 
the luciferase coding sequence was inserted to make 
pGL3/utroB/F (see below). The deduced translation of exon 1B 
is shown; amino acids marked in bold type are identical 
between the human and mouse sequences. The conserved splice 
donor consensus is shown in grey. Two putative Ap1 sites and 
an initiator-like element (Inr) are 100% conserved and 
indicated in black. A solid arrow marks the single 
transcription start indicated by primer extension; figures 
adjacent to the sequence indicate the number of individual 
5' RACE clones that terminated at the positions shown. 

Figure 4 shows the position of the primers used in RT-PCR of 
exon 1B-containing utrophin transcript, and the probes used to 
probe the PCR products. Primers specific to exon 1B (BF31) and 
utrophin C-terminus (CT2) were used to amplify approximately 
9.8kb of utrophin cDNA. The products were blotted and probed 
with U41, U107, BR4 and U16 as indicated. The diagram is not to 
scale; numbering refers to the nucleotide sequence of the 
full-length cDNA. The corresponding functional domains of the 
protein are indicated above: actin binding domain; rod, rod 
domain; Cys, cysteine rich domain; C-Term, C-terminal domain. 

Figure 5 shows a schematic representation of (A) human YAC 
and (B) mouse PAC contigs showing position of exons within the 
genomic map. Key to mouse restriction sites: C, ClaI; S, 
SacII; B, BssHII; X, XhoI. (C) shows the nomenclature for 
utrophin promoters, exons and transcripts. 

Figure 6 shows the in vitro activity of utrophin promoter B. 
(A) shows normalised luciferase activity following 
transfection of three different human cell types with either 
pGL3/utroB/F ("forward construct") or pGL3/utroB/R ("reverse 
construct"). 

Figure 7 shows deletion analysis of promoter B. The 1.5kb 
insert of pGL3/utroB/F was deleted at its 5' and 3' ends using 
the internal restriction sites indicated. Reporter activity 
was assayed following transient transfection of IN157 and 
CL11T47 cells. 

Figure 8 shows conceptual translation of exon 1B as part of 
utrophin, showing a nucleotide sequence and encoded 
polypeptide according to embodiments of the present invention. 

Figure 9 shows the nucleic acid and predicted amino acid 
sequence of a utrophin B isoform "minigene". 

Figure 10 shows the dosage dependence of IL-6 mediated 
expression from the isoform B promoter. 

Oligonucleotides, PCR, RT-PCR and 5' RACE 

PCR and RT-PCR were performed as described (Blake, et al. 
(1996) J Biol Chem 271, 7802-7810). Oligonucleotide sequences 
(5' to 3') were: 



UM83   gatgttcctg tgaggccttc gag
UM82   cactcttgga aaatcgagcg t
U16    actatgatgt ctgccagagt tg
U107   gatccaatag cttccttcca tcttt
UBF    tggaaaaagt ggaggttgga
BR2    tccaacctcc actttttcca
BR4    gcctggagag ctacatgccc t
BF8    ctccacatct ttttcctcat catct
BF9    gattgtggtg atggttgtag aa
BR10   gattgtggtg atggttgtag aa
BR14   gatgatgagg aaaaagatgt ggag
BF15   aaacccaaaa taacacagga catc
BF16   agtgtaactt ctctctggtg
BF31   taagcagatg taggtgatga gc
BF42   gctgcttttg ttgtccactt c
BR43   atagcttcct tccatctttg ag
CT2    ctccacgttc ttccctctct act
2ApF   gcgtgcagtg gaccattttt cagattta
1BpF   cgctgcagca gccaccacat ttcgttg
3pR    gcgtgcagat cgagcgttta tccatttg



5' RACE was undertaken using adapter-ligated mouse heart cDNA 
(Marathon-Ready, Clontech), following the manufacturer's 
protocol, using the supplied adapter primers with nested mouse 
utrophin primers UM83 (exon 4) and UM82 (exon 3). Products 
were cloned in pGEM-T (Promega). Human exon 1B was isolated 
from skeletal muscle cDNA by PCR using mouse primers UBF and 
UM83. 5' RACE was used to clone the 5' end of human exon 1B, 
using primers U107 and BR4. Full-length utrophin RT-PCR was 
done as described (Blake, et al. (1996) J Biol Chem 271, 
7802-7810), but using Boehringer Expand Reverse Transcriptase 
and Long Template PCR reagents, and a primer annealing 
temperature of 59°C. Semi-quantitative RT-PCR was performed 
using primers BF42 and BR43 to amplify utrophin B, and 
commercial primers (Stratagene) to amplify 
glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Exponential 
amplification was established by withdrawing samples from 
thermal cycling at 1 cycle intervals over a range of 5 cycles, 
predicted to span the exponential range following initial 
experiments in which samples were withdrawn at 5 cycle 
intervals. Products were blotted and probed with labelled BR4 
or a 600bp GAPDH probe. Band intensities were quantified using 
a Storm phosphorimager. A graph of log2[band intensity] versus 
cycle number showed a linear relationship with gradient = 1, 
indicating near-perfect exponential amplification. The band 
intensities at any given cycle over this range are therefore 
directly proportional to the amount of cDNA in the original 
samples. 
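
The exponential-range check described above can be sketched as a least-squares fit of log2(band intensity) against cycle number. This is an illustration only, under stated assumptions, and not the analysis actually performed: the function name log2_slope and the band intensities are hypothetical, chosen so that the signal doubles at each cycle.

```python
import math


def log2_slope(cycles, intensities):
    """Least-squares gradient of log2(intensity) versus cycle number."""
    ys = [math.log2(v) for v in intensities]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    return sxy / sxx


# Hypothetical band intensities sampled at 1 cycle intervals over 5 cycles.
cycles = [20, 21, 22, 23, 24]
intensities = [100.0, 200.0, 400.0, 800.0, 1600.0]

slope = log2_slope(cycles, intensities)
# A gradient close to 1 means the product doubles at every cycle.
print(round(slope, 3))
```

A fitted gradient near 1 corresponds to a doubling of product per cycle, which is the condition under which band intensities at a given cycle scale with the amount of starting cDNA.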

Genomic Mapping and Clones 

Human YACs are as previously described (Pearce, et al. (1993) 
Hum Mol Genet 2, 1765-72). Southern blots of restriction 
digested YAC DNA were probed with end-labelled BR4. A 3.0kb 
hybridising XbaI fragment was cloned from YAC 4X124H10 (a YAC 
clone which contains a human genomic DNA insert) into 
pBluescript (Stratagene) generating pBSX2.0. Mouse PACs were 
identified from the RPCI21 library. A 398bp exon 1B/promoter B 
DNA probe (UB400) encompassing human positions 1129 to 1527 
was used for exon 1B mapping. Library filters were screened 
with probes to exons 1A-5 (Dennis, et al. (1996) Nucleic Acids 
Res 24, 1646-52) and UB400. Eleven PACs were identified, and 
four of these arranged into a contig by restriction mapping. 
An 8.0kb XbaI fragment from PAC 110C24, that hybridised with 
UB400, was cloned in pBluescript generating pBSX8.0. 

Northern Blots and Probes 

A human multiple tissue northern blot and β-actin control cDNA 
probe were obtained from Clontech. A utrophin C-terminal cDNA 
probe, encompassing the last 4.0kb of the utrophin message, 
was generated by PCR. Human exon 1B sequence between positions 
1480 and 1596 was cloned into pGEM-T and an exon 1B antisense 
riboprobe was transcribed (In Vitro Transcription Kit, 
Promega) from the SP6 promoter following linearisation of the 
plasmid with NcoI. Hybridisation was carried out at 70°C in 
50% formamide hybridisation buffer (Ausubel, et al. (1999) 
Current Protocols in Molecular Biology (Wiley)) and the 
filter was washed at 75°C in 0.1xSSC, 0.1% SDS for 2 hours. 

RNase Protection 

Specific probes spanning the exon 1B/3 and exon 2A/3 
boundaries were obtained by PCR amplification of mouse heart 
cDNA using primers 2ApF, 1BpF and 3pR. Products were cloned 
in the PstI site of pDP18 (Ambion) and sequenced. Plasmids 
were linearised with EcoRI (1B) or BamHI (2A); labelled 
antisense riboprobe was transcribed from the T7 promoter and 
gel purified. RNase protection was carried out using the RPAIII 
kit (Ambion) following the manufacturer's instructions (30µg 
total RNA unless stated, hybridisation temperature 42°C, RNase 
A/T1 dilution 1:200). Following electrophoretic separation, 
band intensities were quantified as above, and corrected for 
the amount of label present in each protected fragment. 

Promoter/Reporter Constructs 

Reporter constructs were generated by PCR amplification of the 
human sequence between positions 39 and 1503, using pBSX2.0 as 
template. Pfu polymerase was used with primers BF9 and BR14. 
Following 15 cycles of 96°C for 45 seconds, 62°C for 45 
seconds, 72°C for 4 minutes, products were dA-tailed and 
cloned in pGEM-T. Clones were identified with product in both 
orientations and insert, liberated by digestion with 
SacI/NcoI, was cloned into the SacI/NcoI sites of a 
promoterless luciferase reporter plasmid (pGL3 basic, 
Promega), generating constructs with insert in forward 
(pGL3/utroB/F) and reverse (pGL3/utroB/R) orientation with 
respect to the coding sequence of luciferase. Deletions of the 
forward construct were generated by cleavage at SpeI, NdeI, 
EcoRI and PvuII sites in the insert, followed by religation to 
sites in the 5' or 3' polylinker. Constructs were sequenced 
completely. 

Cell Culture and Transfections 

Three human cell lines (IN157 rhabdomyosarcoma (Nielsen et 
al., 1993, Mol Cell Endocrinol 93: 87-95), CL11T47 kidney 
epithelial and HeLa cervical epithelial (Cancer Research, 1952 
12: 264)) were maintained as described (Dennis, et al. (1996) 
Nucleic Acids Res 24, 1646-52). 2µg pGL3/utroB/F or R, or its 
molar equivalent, mixed with 0.5µg of LacZ control plasmid 
(pSV-β-gal, Promega) was transfected in each well of 6 well 
plates using Superfect (Qiagen), following the manufacturer's 
protocol. 48 hours later, cells were harvested and cell 
extracts were assayed for luciferase and β-galactosidase 
activity as described (Dennis, et al. (1996) Nucleic Acids Res 
24, 1646-52). Luciferase activity was standardised to 
β-galactosidase activity in each individual sample to control 
for transfection efficiency. Results are expressed as mean 
luciferase/β-galactosidase ratio for four individual 
transfections. Error bars indicate the standard error of the 
mean. For comparison of different constructs within the same 
cell line, results were standardised to those obtained with 
pGL3/utroB/F and are expressed as % of this value. For 
comparison of constructs between cell lines, results were 
standardised to those obtained with a luciferase-SV40 
promoter/enhancer plasmid (pGL3 control, Promega) that 
generates high levels of reporter activity in all cell lines 
tested. 
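
The normalisation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software actually used: the function names and all raw readings are hypothetical, with four transfections per construct as in the text.

```python
import math
import statistics


def normalised_ratios(luciferase, beta_gal):
    """Luciferase/beta-galactosidase ratio for each individual transfection."""
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def mean_and_sem(values):
    """Mean and standard error of the mean of the ratios."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def percent_of_reference(mean_value, reference_mean):
    """Express a construct's mean ratio as % of the reference construct."""
    return 100.0 * mean_value / reference_mean


# Hypothetical raw readings for four transfections of each construct.
forward_luc = [800.0, 900.0, 1000.0, 1100.0]
forward_gal = [10.0, 10.0, 10.0, 10.0]
reverse_luc = [80.0, 100.0, 90.0, 110.0]
reverse_gal = [10.0, 10.0, 10.0, 10.0]

forward_mean, forward_sem = mean_and_sem(normalised_ratios(forward_luc, forward_gal))
reverse_mean, reverse_sem = mean_and_sem(normalised_ratios(reverse_luc, reverse_gal))

# Reverse construct expressed relative to the forward construct (100%).
print(percent_of_reference(reverse_mean, forward_mean))
```

Dividing by the β-galactosidase signal within each sample, before averaging, is what removes well-to-well differences in transfection efficiency.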

Primer Extension 

Primer extension was carried out as described (18); 
end-labelled primer BR2 was annealed to 0, 30 or 50µg mouse heart 
total RNA at 58°C for 20 minutes, and extended at 42°C for 40 
minutes. Products were separated on a 6% polyacrylamide gel, 
under denaturing conditions, alongside a sequencing ladder 
generated from pBSX8.0 using primer BR2. 

Results 

An alternative 5' exon in utrophin mRNA 

Utrophin from a mouse heart cDNA library was amplified by 
5' RACE, and the resulting products cloned and sequenced. Of 12 
clones, 8 contained novel sequence 5' of exon 3. Below, we 
present evidence that the novel sequence is a single 
alternative 5' exon of utrophin containing a translational 
initiation codon. We refer to this sequence as 'exon 1B' to 
distinguish it from the previously described 5' cDNA sequence 
comprising untranslated exon 1A and exon 2A which contains the 
translational start (Figure 5c). 

Figure 3 shows a sequence comparison of human and mouse exon 
1B, and genomic flanking sequence. The position and phase of 
the splice junction at the 5' end of exon 3 are identical for 
both exon 1B- and exon 2A-containing transcripts. Exon 1B 
contains a putative ATG translation initiation codon and open 
reading frame, in-frame with that of exon 3, predicting a 
novel 31 amino acid N-terminus to the utrophin protein. The 
context of the ATG codon is predicted to be favourable for 
translation in that there is a purine at position -3 (bold in 
Figure 3) (33). Human and mouse exons 1B show 82% nucleotide 
identity. The predicted translations are 84% identical and 94% 
similar. The position and context of the ATG codon are 
conserved. The human sequence contains a second putative ATG 
codon immediately 5' (position 1511, solid bar in Figure 1), 
followed by a TAG stop codon. As this ATG does not adhere to 
the Kozak consensus, is not associated with an open reading 
frame and is not present in the mouse sequence, we predict 
that this is not a functional translation start. A similar 
feature is present in human exon 2A, where the 5'UTR contains 
a short open reading frame prior to the true translation 
start. 
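
The criterion applied above to the ATG context, a purine at position -3, can be illustrated with a short check. This is a sketch only: the function name and the two toy sequences are hypothetical and are not taken from Figure 1 or Figure 3. Position +1 is taken as the A of the ATG, so position -3 lies three bases upstream of it.

```python
def has_purine_at_minus3(sequence, atg_index):
    """Return True if the base at position -3 relative to the ATG is A or G.

    atg_index is the 0-based index of the A of the ATG codon.
    """
    seq = sequence.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3:
        raise ValueError("fewer than three bases upstream of the ATG")
    return seq[atg_index - 3] in "AG"


# Hypothetical toy contexts, not real utrophin sequence.
print(has_purine_at_minus3("ccaccATGg", 5))  # purine (a) at -3
print(has_purine_at_minus3("cctccATGg", 5))  # pyrimidine (t) at -3
```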

The transcript associated with exon 1B 

A human multiple tissue northern blot was probed with an exon 
1B antisense riboprobe. A single hybridising 13kb band was 
observed, identical to that produced by probing the same blot 
with a cDNA encompassing 4kb of the utrophin C-terminus, 
indicating that exon 1B is exclusively associated with a 
full-length utrophin mRNA. Exon 1B is ubiquitously expressed, 
and appears most abundant in heart and pancreas, and least 
abundant in the brain, relative to β-actin. This is similar 
to the expression profile of total full-length utrophin. 

RT-PCR was employed to confirm the association of exon 1B with 
a utrophin mRNA predicted to give rise to functional protein 
(Figure 4). Amplification of first strand cDNA from IN157 
cells utilising a forward primer specific to exon 1B (BF31) and 
a reverse primer within the utrophin C-terminus (CT2) produced 
a product of expected size. Successive hybridisation of this 
PCR product with domain-specific probes U41, BR4, U107 and 
U16 confirmed that exon 1B is associated with a utrophin 
transcript spanning the full coding sequence of the gene. 

The expression profiles of exons 1B and 2A were examined using 
RNase protection. Specific riboprobes corresponding to the 
exon 1B/3 and 2A/3 boundaries were simultaneously hybridised 
with total RNA, allowing direct quantitation of transcript 
abundance. B-utrophin is the most abundant form in the heart, 
whereas exon 2A-containing transcripts predominate in the 
kidney. Approximately equal amounts of exons 1B and 2A were 
observed in the brain and in skeletal muscle. 

Mapping and cloning of genomic sequence associated with exon 
1B 

Using probe BR4, exon 1B was mapped within our previously 
described human YAC contig (26) encompassing the 5' end of the 
utrophin locus (Figure 5a). A hybridising band was seen with 
YAC 4X124H10 but not 4X23E3 or 5C2, indicating that exon 1B 
lies within the 120kb intron 2 of the utrophin gene. A 
subsequent database search identified a clone from the HGMP 
human chromosome 6 sequencing project, containing exons 1A, 2A 
and 1B. This indicated that exon 1B lies 52.2kb 3' of exon 2A 
(Figure 5a). Probing the mouse genomic PAC library (RPCI21 
from P. DeJong, Roswell Park Cancer Institute) with utrophin 
exons 1A, 1B and 2-5 inclusive identified a series of genomic 
PACs spanning the 5' end of the mouse utrophin gene. Four of 
these PACs were assembled into a contig of the region. 
Hybridisation with UB400 confirmed that exon 1B lies within 
intron 2 in the mouse (Figure 5b), approximately 50kb 3' of 
exon 2. 

Human and mouse genomic fragments were obtained from the YAC 
and PAC libraries, respectively. Genomic sequence 
encompassing exon 1B was obtained by an XbaI digest of YAC 
4X124H10 (human 3kb fragment) and PAC 110C24 (mouse 8.0kb 
fragment). These fragments were sub-cloned into pBluescript 
vector; the human fragment was deleted to 2kb during the 
sub-cloning. The plasmid clones were designated pBSX2.0 (human) 
and pBSX8.0 (mouse). Comparison of the cDNA and genomic 
sequence showed no evidence of a further 5' exon in the 
transcript associated with exon 1B, suggesting that the 
genomic flanking sequence contained the transcription start 
and promoter element responsible for exon 1B expression. Our 
nomenclature for utrophin 5' exons, transcripts and promoters 
appears in Figure 5c. 



Promoter B 

1.5kb of human genomic sequence 5' of exon 1B, including the 
5'UTR of exon 1B, was cloned in both orientations into a 
promoterless luciferase reporter vector. Three human cell 
lines (IN157 rhabdomyosarcoma, CL11T47 kidney epithelial and 
HeLa cervical epithelial) were transiently transfected with 
these constructs. These three lines were chosen because they 
are known to express utrophin mRNA and protein at different 
levels. Reporter activity was detected at significantly higher 
levels in cells transfected with the forward than the reverse 
orientation construct, indicating promoter activity (Figure 
6). Interestingly, the level of activity varied between cell 
lines by an order of magnitude. Semi-quantitative RT-PCR 
demonstrated that the variation of luciferase expression 
mimicked the transcription profile of endogenous utrophin exon 
1B. In contrast, the GAPDH control showed identical 
amplification in all cDNA samples, indicating that the 
differences seen in B-utrophin amplification have arisen from 
differences in the level of expression of the endogenous 
B-utrophin transcript in these cell lines. These data show that 
the 1.5kb of genomic sequence 5' of exon 1B utilised in these 
reporter clones contains the necessary signals to initiate 
transcription of exon 1B, and regulatory elements that 
determine the level of expression in these cell lines. 



To further delineate important elements within this region, a 
series of 5' and 3' deletions of promoter B were made, and the 
in vitro activity of each one assayed (Figure 7). A 300bp 
element, contained within clone pGL3/utroB/F/D5'Pvu1199, 
retains 70% activity of the full 1.5kb construct in expressing 
cell lines, and shows 74% identity between human and mouse 
(Figure 3). Homology falls to 50% when sequence further 5' of 
the human PvuII site is compared with corresponding mouse 
sequence using a 35bp window. Homology was determined using 
GAP, from version 20 of GCG, with default parameters as noted 
already above. 
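
The idea of scoring homology in a 35bp window can be sketched as below. This is a simplified stand-in and not the GCG GAP program used above: it assumes the two sequences have already been aligned to equal length (with "-" for gaps), and the function name and toy sequences are hypothetical.

```python
def window_identity(seq_a, seq_b, window=35):
    """Percent identity in each sliding window over two pre-aligned sequences.

    Gap characters ("-") never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(seq_a) < window:
        raise ValueError("alignment is shorter than the window")
    matches = [a == b and a != "-" for a, b in zip(seq_a.lower(), seq_b.lower())]
    current = sum(matches[:window])
    result = [100.0 * current / window]
    for i in range(window, len(matches)):
        # Slide the window by one column: add the new column, drop the oldest.
        current += matches[i] - matches[i - window]
        result.append(100.0 * current / window)
    return result


# Toy aligned sequences with one mismatch, scored with a 5-base window.
print(window_identity("acgtacgtac", "acgtacgaac", window=5))
```

Each value is the identity of one window, so a drop in the profile marks the region where conservation falls away.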

Promoter B transcription start site 

The 5' ends of 8 human and 4 mouse 5' RACE clones clustered 
around a putative cap site in the genomic sequence (Figure 3). 
None of the 5' RACE clones generated by amplification across 
the exon 3/exon 1B boundary extended further upstream. RT-PCR 
was carried out using forward primers around this region with 
a reverse primer in exon 4. A product of expected size was 
amplified from IN157 cDNA by primers BF42 and BF8, but not 
BF16 or BF15, indicating that the transcription start is 
within the 18bp that separates the two primers BF15 and BF42. 
These 18 bases contain the putative cap site and the cluster 
of RACE clone 5' ends. 

To map the start site accurately, primer extension using an 
exon 1B reverse primer and mouse heart RNA was employed. This 
yielded a single product, indicative of a single transcription 
start site. Transcription initiates at mouse position 1183 
within a 25-bp motif, which is 100% conserved between human 
and mouse. Part of this motif, spanning the cap site, is a 6/7 
base match for the initiator consensus, and correspondingly 
shows homology to the initiators of other genes. The 
transcription start site is homologous to the initiators of 
other promoters. Consensus 1, initiator consensus derived from 
sequence comparison of Inr⁺ genes (Azizkhan, et al. (1993) 
Critical Reviews in Eukaryotic Gene Expression 3, 229-254); 
consensus 2, experimentally-derived consensus for functional 
initiator (Javahery, et al. (1994) Molecular and Cellular 
Biology 14, 116-127); TdT, terminal deoxynucleotidyl 
transferase; hRAR, human retinoic acid receptor α; mCREB, 
mouse cAMP response element binding protein. Transcribed 
sequence is indicated in bold uppercase. We consider this 
promoter to be of the TATA⁻Inr⁺ type. 

Assaying for substances which modulate utrophin promoter 
activity 

Method 1: 

This method uses a mouse mdx-H2K myoblast line stably 
transfected with a human 7.0kb utrophin promoter-luciferase 
construct. On day 1 myoblast cells transfected with the 
construct are plated out in 6-well dishes, with compound or 
DMSO-only for the negative controls. 

Four 6-well plates are used for every 3 compounds (the 
compounds are dissolved in DMSO and stored prior to use). For 
example, on each plate compounds A, B and C are each added to 
1 well, while the remaining 3 wells contain only DMSO. This 
results in 4 wells containing each compound and 12 wells with 
DMSO alone. Due to the inherent noise of both the 
harvesting/assay and cell seeding/growth steps, this is the 
minimum number that results in meaningful analysis. Setting up 
the plates in this way means that the data really are paired, 
and can be analysed with a paired Student's t-test. This 
provides a more powerful statistical analysis than putting 
each compound on a different plate and comparing it with a 
control plate. 
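
The pairwise analysis described above can be sketched as a paired t statistic computed across the four plates. This is an illustration under stated assumptions: pairing each compound well with the mean of the three DMSO wells on the same plate is one possible reading of the design and is not stated in the text, and the function name and all readings are hypothetical.

```python
import math
import statistics


def paired_t(treated, control):
    """Paired t statistic and degrees of freedom for matched readings."""
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical luciferase readings: one compound A well per plate, 4 plates.
compound_a = [12.0, 15.0, 11.0, 14.0]
# Hypothetical readings for the three DMSO-only wells on each plate.
dmso_wells = [
    [10.0, 11.0, 9.0],
    [13.0, 12.0, 14.0],
    [9.0, 10.0, 11.0],
    [12.0, 11.0, 13.0],
]
# Assumption: the plate-matched control is the mean of that plate's DMSO wells.
dmso_means = [statistics.mean(wells) for wells in dmso_wells]

t_stat, dof = paired_t(compound_a, dmso_means)
print(t_stat, dof)
```

The statistic would then be compared with the t distribution on the stated degrees of freedom. Because each difference is taken within a plate, plate-to-plate variation in seeding and harvesting cancels out of the comparison.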

On day 4 the cells are harvested and luciferase quantitation 
and pairwise analysis are carried out. 

Method 2: 

Compounds which up-regulate the endogenous utrophin promoter 
are found using mdx-H2K myoblasts that are not transfected 
with the utrophin promoter-luciferase construct. Mdx 
myoblasts can be used to mimic utrophin transcription and 
protein stability in dystrophin-deficient cells. 



Identification of utrophin protein expression 

Quantitative Western blotting is used to measure the level of 
utrophin expression (Tinsley JM, et al., Nature Medicine 4, 
1441-1444). Using 6 well plates and treating with compound as 
described above generates enough total protein sample to test 
by Western blotting. Antibodies specific to the A protein or 
B protein are used to quantify levels of either protein. 

Identification of utrophin RNA expression 

Quantitative ribonuclease protection is used to analyse levels 
of utrophin expression. A pairwise design is used, as 
described above, but more cells are necessary. To see bands 
clearly, about 20-30µg total RNA is used. Each compound and 
control will need a 175 cm² tissue culture flask. A dual probe 
to simultaneously identify the A transcript and B transcript 
is used. 

Using the two techniques described, compounds which modulate 
utrophin levels are identified after cell treatment. The same 
techniques are used for in vivo animal experiments where the 
compound is administered to dystrophin deficient mdx mice. 

Interleukin-6 (IL-6) Interactions 

Two related elements are present in the promoters of genes 
encoding acute phase proteins that mediate an increase in 
transcription stimulated by an IL-6 triggered signalling 
cascade (Hocke et al., 1992). One of these was found to be 
present in the exon 1B flanking sequence. Wild type and 
mutated reporter fusions for IL-6 were therefore tested for 
responsiveness in appropriate cell systems. 

Constructs of the 1.5F B promoter, normal and mutant (consensus 
change: ctggaa > gatatc; concerning the mutant: Hattori M et 
al (1990) Proc. Natl. Acad. Sci. USA 87(6): 2364-8), were 
introduced into a promoter-less luciferase reporter vector and 
transfected into IN157 cells with a renilla firefly control. 
Cells were washed and charcoal-stripped serum added 5 hours 
post-transfection and left overnight. IL-6 was added in the 
amounts illustrated, with an appropriate amount of IL-6 soluble 
receptor. The cells were left for 24 hours and then assayed 
for activity using a luminometer. 



A dosage dependent transcriptional response was noted in the 
normal, but not the mutated reporter construct (Figure 10). 
This result indicates the existence of a cytokine mediated 
signalling pathway which causes up-regulation of the B utrophin 
promoter through the interaction of IL-6 and IL-6 receptor with 
the conserved IL-6 response element. 

Discussion 

We have demonstrated that there is a second promoter within 
intron 2 of the utrophin gene, driving expression of a unique 
first exon that splices into a common 13kb mRNA. These data are 
important, both in terms of understanding the molecular 
physiology of utrophin expression, and in view of their 
application to therapeutic intervention in DMD. 

The functional consequences of genes having more than one 
promoter have been postulated (reviewed in Ayoubi, et al. 
(1996) FASEB J. 10, 453-460). A single gene may achieve a 
complex temporal and spatial expression pattern by interaction 
of different promoters with discrete subsets of transcription 
factors. Dystrophin is an example: three dissimilar promoters 
are active at different levels in specific cell types within 
the heart, skeletal muscle and the brain (Gorecki, et al. 
(1992) Hum Mol Genet 1, 505-510; Barnea, et al. (1990) Neuron 
5, 881-888; Holder, et al. Human Genetics 97, 232-239). 
Northern blot analysis, however, indicates that utrophin exon 
1B is ubiquitously expressed, implying that promoters A and B 
are co-expressed in many tissues. It is conceivable that 
examination of transcript distribution in whole tissue samples 
has masked cell type-specific patterns of expression. Data 
from isolated human cell lines in vitro support this notion; 
we observed large differences in promoter B activity between 
different cell lines, consistent with an in vivo expression 
profile involving specific cellular populations. 
Alternatively, the two promoters may be spatially regulated at 
a sub-cellular level. Within adult skeletal muscle fibres, 
promoter A is synaptically driven (Gramolini, et al. (1997) J 
Biol Chem 272, 8117-20), yet aggregates of utrophin mRNA are 
detectable at up to 25% of extrasynaptic nuclei (Vater, et al. 
(1998) Molecular and Cellular Neuroscience 10, 229-242). 
Expression of promoter B in the extrasynaptic compartment 
might be invoked as one possible explanation. 

A second proposed function of alternative promoters is the 
generation of transcripts with interchangeable 5' exons, 
giving rise to mRNAs with alternative 5'UTRs or proteins with 
novel N-terminal domains. Unlike exon 1B, utrophin exon 1A 
contains a long GC-rich 5'UTR. In some transcripts, GC-rich 
5'UTRs are not translated efficiently (Kozak, M. (1991) J Cell 
Biol 115, 887-903), and there are examples of genes in which 
alternative use of GC-rich and non-GC-rich 5'UTRs has been 
implicated in post-transcriptional regulation of protein 
synthesis (Nielson, et al. (1990) J Biol Chem 265, 
13431-13434). In addition, the predicted 31 amino acids encoded 
by exon 1B are different to the 26 amino acids of exon 2A; the 
functions of the resulting N-termini may be different. 

The discovery of a second promoter provides a new target for 
the upregulation of utrophin to ameliorate the DMD phenotype. 
Promoter B is highly regulated, probably by different factors 
from promoter A, including IL-6. Elucidation of the mechanisms 
responsible for the large difference in promoter B activity 
between IN157 and HeLa cells might lead to identification of a 
factor that can be delivered to muscle to activate utrophin 
expression. Importantly, as the N-box motif is absent from 
promoter B, this is unlikely to carry any risk of NMJ 
disruption potentially inherent in the pharmacological 
manipulation of synaptically regulated promoter A. 

CLAIMS 

1. An isolated nucleic acid comprising a promoter which 
comprises a sequence of nucleotides selected from (i) the 
human promoter sequence shown in Figure 1 and (ii) the mouse 
promoter sequence shown in Figure 2, free or substantially 
free of utrophin coding sequence. 

2. An isolated nucleic acid consisting essentially of a 
promoter which comprises the sequence of nucleotides shown 5' 
to position 1440 in Figure 1. 

3. An isolated nucleic acid consisting essentially of a 
promoter which comprises the sequence of nucleotides shown 5' 
to position 1183 of the mouse sequence shown in Figure 2. 

4. An isolated nucleic acid consisting essentially of a 
promoter which comprises the nucleotides numbered 1199-1440 
in the sequence shown in Figure 1. 

5. An isolated nucleic acid consisting essentially of a 
promoter which comprises the nucleotides numbered 959-1183 in 
the sequence shown in Figure 2. 

6. An isolated nucleic acid consisting essentially of a 
promoter which comprises the nucleotide sequence 
ACAGGACATCCCAGTGTGCAGTTCG. 

7. An isolated nucleic acid consisting essentially of a 
promoter which comprises a sequence of nucleotides that is an 
allele, mutant or derivative, by way of addition, insertion, 
deletion or substitution of one or more nucleotides, of the 
promoter sequence shown in Figure 1, which sequence has at 
least 60% homology with the promoter sequence shown in Figure 
1 and which promoter, when operably linked to a sequence of 
nucleotides, has the ability to initiate transcription of that 
sequence, said transcription being muscle-specific. 

8. An isolated nucleic acid consisting essentially of a 
promoter which comprises a sequence of nucleotides that is an 
allele, mutant or derivative, by way of addition, insertion, 
deletion or substitution of one or more nucleotides, of the 
promoter sequence shown in Figure 2, which sequence has at 
least 60% homology with the promoter sequence shown in Figure 
2 and which promoter, when operably linked to a sequence of 
nucleotides, has the ability to initiate transcription of that 
sequence, said transcription being muscle-specific. 

9. An isolated nucleic acid consisting essentially of a 
promoter which comprises a sequence of nucleotides that is an 
allele, mutant or derivative, by way of addition, insertion, 
deletion or substitution of one or more nucleotides, of the 
promoter sequence shown in Figure 2, which hybridises to the 
promoter sequence shown in Figure 2 under stringent 
hybridisation conditions and which promoter, when operably 
linked to a sequence of nucleotides, has the ability to 
initiate transcription of that sequence, said transcription 
being muscle-specific. 

10. A nucleic acid construct comprising an isolated nucleic 
acid according to any of the preceding claims operably linked 
to a heterologous sequence. 

11. A nucleic acid construct according to claim 10 wherein 
the heterologous sequence is a coding sequence. 

12. A nucleic acid construct according to claim 11 wherein 
the heterologous sequence encodes a reporter molecule. 

13. A host cell comprising a nucleic acid construct 
according to any of claims 10 to 12. 

14. A method comprising culturing a host cell according to 
claim 13 under conditions for transcription of said 
heterologous sequence from the promoter. 

15. A method according to claim 14 wherein the heterologous 
sequence is a coding sequence and the host cell is cultured 
under conditions for expression of the encoded peptide or 
polypeptide product. 

16. A method according to claim 14 or claim 15 comprising 
detection of transcription of the heterologous sequence. 

17. A method according to claim 14 or claim 15 comprising 
detection of expression of the encoded peptide or polypeptide 
product. 

18. A method of screening for a substance able to modulate 
utrophin promoter activity, the method comprising contacting 
an expression system containing a nucleic acid construct 
according to any of claims 10 to 12 with a test or candidate 
substance and determining transcription of the heterologous 
sequence or expression of the encoded peptide or polypeptide 
product. 

19. A method according to claim 18 wherein the expression 
system comprises a host cell containing said nucleic acid 
construct. 

20. A method which comprises, following identification of a 
substance able to modulate utrophin promoter activity in 
accordance with a method according to claim 18 or claim 19, 
manufacture of the substance and/or use of the substance in 
manufacture or formulation of a composition. 

21. The use of an isolated nucleic acid according to any of 
claims 1 to 6 for promoting transcription of an operably 
linked sequence of nucleotides. 

22. The use of claim 21 wherein the transcription is 
tissue-specific, with the tissue-specificity being 
muscle-specific. 

23. An isolated nucleic acid molecule comprising a 
nucleotide sequence encoding a polypeptide including the amino 
acid sequence shown in Figure 1 or Figure 2. 

24. An isolated nucleic acid molecule comprising a 
nucleotide sequence encoding a polypeptide that is an allele, 
mutant or derivative of a polypeptide including the amino acid 
sequence shown in Figure 1, which amino acid sequence has at 
least 60% homology with the polypeptide sequence in Figure 1 
or Figure 2. 

25. An isolated nucleic acid molecule comprising a 
nucleotide sequence encoding a polypeptide that is an allele, 
mutant or derivative of a polypeptide shown in Figure 1 or 
Figure 2, which nucleotide sequence hybridises with the 
nucleotide sequence encoding the polypeptide in Figure 1 or 
Figure 2 under stringent hybridisation conditions. 

26. An isolated nucleic acid molecule comprising a 
nucleotide sequence encoding a polypeptide having the amino 
acid sequence shown in Figure 9. 

27. An isolated nucleic acid molecule comprising the 
nucleotide sequence shown in Figure 9. 

28. Nucleic acid of any one of claims 23 to 27 comprised in 
a vector. 

29. Nucleic acid according to claim 28 wherein said vector 
is an expression vector. 

30. A host cell containing heterologous nucleic acid 
according to any one of claims 23 to 29. 

31. A cell according to claim 30 which is a muscle cell. 

32. A cell according to claim 30 wherein said polypeptide 
is expressed. 

33. A cell according to any of claims 30 to 32 which is in 
a mammal. 

34. A non-human mammal having a cell according to any of 
claims 30 to 32. 

35. A non-human mammal containing nucleic acid according to 
any of claims 23 to 29. 

36. A method including introduction of nucleic acid 
according to any of claims 23 to 29 into a cell. 

37. A method according to claim 36 wherein said 
introduction takes place in vitro. 

38. A method which includes causing or allowing expression 
of the coding nucleotide sequence of heterologous nucleic acid 
according to any of claims 23 to 29 in a cell. 

39. A method according to claim 38 wherein the cell is part 
of a mammal. 

40. A method according to claim 38 wherein the expression 
product is purified and/or isolated following expression. 



41. A method according to claim 40 wherein the expression 
product is formulated into a composition which includes at 
least one additional component, following purification and/or 
isolation of the expression product. 

42. An isolated polypeptide as encoded by nucleic acid 
according to any of claims 23 to 29. 

43. An isolated utrophin exon 1B polypeptide selected from: 
(i) human utrophin exon 1B polypeptide of which the amino 
acid sequence is shown in Figure 1; 
(ii) mouse utrophin exon 1B of which the amino acid sequence 
is shown in Figure 1. 

44. An isolated polypeptide including the human polypeptide 
according to claim 43. 

45. An isolated polypeptide including the mouse polypeptide 
according to claim 44. 

46. An isolated polypeptide which has 60% homology with 
the polypeptide according to claim 44 or 45. 

47. An isolated fragment of a polypeptide according to 
claim 43, which fragment is 5 to 25 amino acids in length. 

48. An isolated fragment of a polypeptide according to 
claim 43, which fragment is 10 to 20 amino acids in length. 

49. An antibody specific for a polypeptide according to any 
one of claims 42 to 48. 

50. A composition including a polypeptide according to any 
one of claims 42 to 46, a fragment according to claim 47 or 
claim 48, or an antibody according to claim 49, and a 
pharmaceutically acceptable excipient. 

51. Use of nucleic acid according to any of claims 23 to 29 
in the manufacture of a medicament for treating a dystrophin 
phenotype in a mammal. 

52. Use of a polypeptide according to any of claims 42 to 
48 or an antibody according to claim 49 in the manufacture of 
a medicament for treating a dystrophin phenotype in a mammal. 

Human B-utrophin up to nucleotide 1500, deduced translation 

CCCASTGTGCAGTTCGAAGGCTGCTTTTGTTGTCCACTTCCTCCACOTCTTTTTCCTCAT 
1 + + 1 ___ + + + ( 

GGGTCACACGTCAAGCTTCCGACGAAAACAACAGCTGAAGGAGGTGTAGAAAAAGGACTA 
CATCTAAGCAGATGTAGGTGATGAGCGGCCTGGCAGCCACCACGTTTCATTGGAAAAAGT 



GTAGATTCGTCTACATCCACTACTCGCCGGACCGTCGGTGGTGCAAAGTAACCTTTTTCA 

MSG LAATTFHWKKC- 



GCAGATTGGATTTGCCASSGCATGTAGCTCTCCAGGCTTGCAAGCGATTACCAG \TGAAC 
! + + + + + _ + 18( 

CGTCTAACCTAAACGGTCCCGTACATCGAGAGGTCCGAACGTTCGCTAATGGTCfACTTG 
RLDLPGHVALQACKRLPDEH- 

ACAATGACGTACAGAAGAAAACCTTTACCAAATGGATAAATGCTCGATTTTCAAAGAGTG 

1 + + + „, 

TGTTACTGCATGTCTTCTTTTGGAAATGGTTTACCTATTTACGAGCTAAAAGTTTCTCAC 

NDVQKKTFT KWINARFSKSG- 

GGAAACCACCCATCAATGATATGTTCACA6ACCTCAAAGATGGAAGGAAGCTATTGGATC 
1 + + + + + + 30( 

CCTTTGGTGGGTAGTTACTATACAAGTGTCTGGAGTTTCTACCTTCCTTCGATAACCTAG 

KPPINDMFT DLKDGRKLLDL- 

TTCTAGAAGGCCTCACAGGAACATCACTGCCAAAGGAACGTGGTTCCACAAGGGTACATG 

X +— + + + + • + 36< 

AAGATCTTCCGGAGTGTCCTTGTAGTGACGGTTTCCTTGCACCAAGGTGTTCCCATGTAC 

LEGLTGTSL PKERGSTRVHA- 

CCTTAAATAACGTCAACAGAGTGCTGCAGGTTTTACATCAGAACAATGTGGAATTAGTGA 
J + + + + + _. + <2( 

GGAATTTATTGCAGTTGTCTCACGACGTCCAAAATGTAGTCTTGTTACACCTTAATCACT 

LNNVNRVLQ VLHQNHVELVN- 
ATATAGGGGGAACTGACATTGTGGATGGAAATCACAAACTGACTTTGGGGTTACTTTGGA 

I + + i_ + ^ ___ + + 4g( 

TATATCCCCCTTGACTGTAACACCTACCTTTAGTGTTTGACTGAAACCCCAATGAAACCT 
IGGTDIVDGHHKLTLGLLMS- 

GCATCATTTTGCACTGGCAGGTGAAAGATGTCATGAAGGATGTCATGTCGGACCTGCAGC 

1 + + + + S4( 

CGTAGTAAAACGTGACCGTCCAC1TTCTACAGTACTTCCTACAGTACAGCCTGGACGTCG 
I I LHWQVKD VMKDVMS DLQQ- 

AGACGAACAGJGAGAAGATCCTGCTCAGCTGGGTGCGTCAGACCACCAGGCCCTACAGCC 
I + + + + + + g0( 

TCTGCTTGTCACTCTTCTAGGACGAGTCGACCCACGCAGTCTGGTGGTCCGGGATGTCGG 
THSEKILLSWVRQTTRPYSQ- 

AAGTCAACGTCCTCAACTTCACCACCAGCTGGACAGATGGACTCGCCTTTAATGCTGTCC 

I + + — — + + — + 66( 

TTCAGTTGCAGGAGTTGAAGTGGTGGTCGACCTGTCTACCTGAGCGGAAATTACGACAGG 

VNVLHFTTS WTDGLAFNAVL- 
TCCACCGACATAAACCTGATCTCTTCRGCTGGGATAAAGTTGTCAAAATGTCACCAATTG 

I _ + + „ „ t + __._ + 72( 

AGGTGGCTGTATTTCGACTAGAGAAGTCGACCCTATTTCAACAGTT'rTACAGTGGTTAAC 

HRHKPDLFSHOKVVKMSPIE- 

Figure 8 



WO 01/25461 



PCT/GB00/03800 



AGW»CTTCAACAT5CCTTCAGCAAGGCTCAAACTTATTtG6GAATTGAAAAGCTGTtAG 

1 *- + + -+ --+ + 7 

TCTCTGAACTTGTACGGAAGTCGTTCCGAGTTTGAATAAACCCTTAACTTTTCGACAATC 

RLEHAFSKAOTYLGIEKLLD- 

ATCCTGAAGATGTTGCCGTTCGGCTTCCTGACAAGAAATCCATAATTATGTATTTAACAT 

, ► -+--- ♦ * + a 

TAGGACTTCTACAACGGCAAGCCGAAGGACTGTTCTTTAQGTATtAATACATAAATTGTA 

pedvavrlpdkksiimylts- 
:tcagcaagtcaccatagacgccatccgtgaggtagagaca 
gaaacaaactccacgatggagtcgttcagtggtatctgcggtaggcactccatctctgtg 

LFEVtPQQVTI.DAIREVETli- 

TCCCAAGGAAATATAAAAAAGAATGTGAAGAAGAGGCAATTAATATACAGAGTACAGCGC 

1 + + + + — — + + 96C 

AGGGTTCCTTTATATTTTTTCTTACACTTCTTGTCCGTTAATTATATGTCTCATGTCGCG 

PRKYKKECEEEAINIQSTAP- 

CTGAGGAGGAGCATGAGAGTCCCCGAGCTGAAACTCCCAGCACTGTCACTGAGGTCGACA 

jj + + + +-- - + + 102 

GACTCCTCCTCGTACTCTCAGGGGCTCGACTTTGAGGGTCGTGACAGTGACTCCAGCTGT 

EEEHESPRAETPSTVTEVDM- 

TGGATCTGGACAGCTATCAGATTGCGTTGGAGGAAGTGCTGACCTGGTTGCTTTCTGCTG 

i + + +_ +-___„„__„+ + 1QB 

ACCTAGACCTGTCGATAGTCTAACGCAACCTCCTTCACGACTGGACCAACGAAAGACGAC 



TCCTGTGAAAGGTCCTCGTCCTACTATAAAGACTACTACAACTTCTTCAGTTTCTGGTCA 
DTFQEQDDISDDVSEVKDQF- 



AACGTTGGGTACTTCGAAAATA 

ATHEAFMHELTAHQS 



AGGACGTCCGTCCGTTGGTTGACTATTGTGTTCCTTGAGACAGTCTG 
IiQAGNOtilTQGtLSD 



TCTAAGTCCTTGTCTACrGGSACGACTfACGATCTACCCTCCGAGAATCCCACCTCTCAT 




ACCTACTACTACATTTTAGAGATGTTTTCGACGATCTTCTTGTATTTTCAAACGTTTCAC 
ODDVKSLGKLLEEHKSLQ 

Sequence Range: 1 to 6059 

10 20 30 40 50 60 70 80 

ACTAGTCAAG ATGAGCGGCC TGGCAGCCAC CACGTTTCAT TGGAAAAAGT GCAGATTGGA TTTGCCAGGG CATGTAGCTC 
MSG L A AT TFH WKK CRLD LPG H V A» 

90 100 110 120 130 140 150 160 

TCCAGGCTTG CAAGCGATTA CCAGATGAAC ACAATGATGT ACAGAAGAAA ACCTTTACCA AATGGATAAA CGCTCGATTT 
L Q A C KRL PDE HNDV QKK TFT KWIN ARF> 

170 180 190 200 210 220 230 240 

TCCAAGAGTG GGARACCACC CATCAGTGAT ATGTTCTCAG ACCTCAAAGA TGGGAGAAAG CTCTTGGATC TTCTCGAAGG 
SKS GKPP ISD MFS DLKD GRK LLD LLBGj 

250 260 270 280 290 300 310 320 

CCTCACAGGA ACATCATTGC CAAAGGAACG TGGTTCCACA AGGGTGCATG CCTTAAACAA TGTCAACCGA GTGCTAGAGS 
LTG TSL PKER GST RVH ALNN VMR V J, Q> 

330 340 350 360 370 380 390 400 

TTTTACATCA GAACAATGTG GACTTGGTGA ATATTGGAGG CACGGACATT GTGGCTGGAA ATCCCAAGCT GACTTTAGGG 
VLHQ NKV DLV NIGG TD1 V A G HPKt, T i< G» 

410 420 430 440 450 460 470 480 

TTACTCTGGA GCATCATTCT GCACTGGCAG GTGAAGGATG TCATGAAAGA TATCATGTCA GACCTGCAGC AGACAAACAG 
LLW SIIL HWQ VKD VMKD IMS DLQ QTHS> 

490 500 510 520 530 540 550 560 

CGAGAAGATC CTGCTGAGCT GGGTGCGGCA GACCACCAGG CCCTACAGTC AAGTCAACGT CCTCAACTTC ACCACCAGCT 
EKI LLS W V R Q TTR PITS QVNV L If F TTS> 

570 580 590 600 610 620 630 640 

GGACCGATGG ACTCGCGTTC AACGCCGTGC TCCACCGGCA CAAACCAGAT CTCTTCGACT GGGACGAGAT GGTCAAAATG 
WIDG LAP HAV LHRH KPD LFD WCEM V K M> 

650 660 670 680 690 700 710 720 

TCCCCAATTG AGAGACTTGA CCATGCTTTT GACAAGGCCC ACACTTCTTT GGQAATTGAA AAGCTCCTAA GTCCTGAAAC 
SPI E R L D HAF DKA HTSL 0 I E KLL SPBT> 

730 740 750 760 770 780 790 800 

TOTTGCTGTG CATCTCCCTG ACAAGAAATC CATAATTATG TATTTAACGT CTCTGTTTGA GOTGCTTCCT CAGCAAGTCA 
VAV HLP DKKS IIM YLT SLFE V L P Q Q V> 

810 820 830 840 850 860 870 880 

CGATAGATGC CATCCGAGAG GTGGAGACTC TCCCAAGGAA GTATAAGAAA GAATGTGAAG AGGAAGAAAT TCATATCCAG 
T I D A IRE VET CPRK If K K ECE E E E I HIQ> 

890 900 910 920 930 940 950 960 

AGTGCAGTGC TGGCAGAGGA AGGCCAGAGT CCCCGAGCTG AGACCCCTAG CACCGTCACT GAAGTGGACA TGGATTTGGA 
SAV LAEE GQS PRA ETPS TVT EVD MOLD* 

970 980 990 1000 1010 1020 1030 1040 

CAGCTACCAG ATAGCGCTAG AGGAAGTGCT GACGTGGCTG CTGTCCGCGG AGGACACGTT CCAGGAGCAA CATGACATTT 
SYQ I A L EEV1, TWL LSA EDTF QEQ HDI> 

1050 1060 1070 1080 1090 1100 1110 1120 

CTGATGATGT CGAAGAAGTC AAAGAGCAGT TTGCTACCCA TGAAACTTTT ATGATGGAGC TGACAGCACA CCAGAGCAGC 
SDDV EBV KBQ FATH ETF MME LTAH Q S S> 

1130 1140 1150 1160 1170 1180 1190 1200 

GTGGGGAGCG TCCTGCRGGC TGGCAACCAG CTGATGACAC AAGGGACTCT GTCCRGAGAG GAGGAGTTTG AGATCCAGGA 
VGS VLQA GNQ LMT QGTL S RE EEF EIQE> 

1210 1220 1230 1240 1250 1260 1270 1280 

ACAGATGACC TTGCTGAATG CAAGGTGGGA GGCGCTCCGG GTGGAGAGCA TGGAGAGGCA GTCCCGGCTG CACGACGCTC 
Q M T LLN ARWE ALR VES MERQ SRL HPA> 

1290 1300 1310 1320 1330 1340 1350 1360 

TGATGGAGCT GCAGAAGAAA CAGCTGCAGC AGCTCTCAAG CTGGCTGGCC CTCACAGAAG ABCGCATTCA GAAGATGGAG 
SLP LGDD LP S LQK L L Q E HKS h Q H 



L E D Q L Q 



HIS I L 



DKV QTS NFKD QKE L S V S V R R LA 



Q L L S N P K» 



L V Q R L 



E T V H V> 



PL T K 



ELGE NLQ ELR DLTQ 



L N R t B L E 



BR D K I S E 



OGAGCCCAGT 

WHT WKK ICRS VPT TLK ECIQ EPS> 

2390 2400 
. TCTGCGTCAG ATATTCCTGT 
S V S QTRI A A H PHV QKVV LVS SAS D I P V> 

2410 2420 2430 2440 2450 2460 2470 2480 

TCAOTCrCAr CGTACTTCGG AAATTTCAAT TCCTGCTGAT CTTGATAAAA CTATAACAGA ACTAGCCGAC TGGCTGGTAT 

2730 2740 2750 2760 2770 2780 2790 2800 

TTGAGCTAAG ACAGCAGCAG CTTGAGGACA TGATTATTGA CAGTCTTCAG TGGGATGACC ATAGGGAGGA GACTGAAGAA 
V E L R QQQ LE D HI I D SLQ WDD HRB E TEE> 

2810 2820 2830 2840 2850 2860 2870 2880 

CTGATGAGAA AATATGAGGC TCGACTCTAT ATTCTTCAGC AAGCCCGACG GGATCCACTC ACCAAACAAft TTTCTGATAA 
L M R K Y B A R L Y I L Q Q A R R DPL T K Q I S D N> 

2890 2900 2910 2920 2930 2940 2950 2960 

-»r-m CTTCAAGAAC TGGGTCCTGG AGATGGTATC GTCATGGCGT TCGATAACGT CCTGCAGAAA CTCCTGGAGG 
LQE LGPG DGI VMA PDNV L Q K LLE> 

2970 2980 2990 3000 3010 3020 3030 3040 

AATATGOGAG TGATGACACA AGGAATCTGA AAGAAACCAC AGAGTACTTA AAAACATCAT GGATCAATCT CAAACAAAGT 
BYGS DDT RHV KETT BYL KTS W I N L KQS> 

3080 3090 3100 3110 3120 

GAGTGGAGGA CGGTGCAGGC CTCTCGCAGA GATCTGGAAA ACTTCCTGAA 
RQHA LEA EWR TVQA SRR D Ii E NPLK> 

3130 3140 3150 3160 3170 3180 3190 3200 

GTGGATCCAA GAAGCAGAGA CCACAGTGAA TGTGCTTGTG QATGCCTCTC ATCGGGAGAA TGCTCTTCAG GATAGTATCT 
W I Q EAE TTVN V L V DAS H R K H ALQ DSI> 

3210 3220 3230 3240 3250 3260 3270 3280 

TGGCCAGGGA ACTCAAACAG CAGATGCAGG ACATCCAGGC AGAAATTGAT GCCCACAATG ACATATTTAA AAGCATTGAC 
LARE LKQ Q M Q D I Q A EID AHN B I F K SID> 

3290 3300 3310 3320 3330 3340 3350 3360 

GGAAACAGGC AGAAGATGGT AAAAGCTTTG GGAAATTCTG AAGAGGCTAC TATGCTTCAA CATCGACTGG ATGATATGAA 
GNR QKMV K A L GNS BEAT MLQ - - 



D D M N> 

3370 3380 3390 3400 3410 3420 3430 3440 

CCAAAGATGG AATGACTTAA AAGCAAAATC TGCTAGCATC AGGGCCCATT TGGAGGCCAG CGCTGAGAAG TGGAACAGGT 
QBW N D L KAKS ASI RAH LEAS A B K WHR> 

3450 3460 3470 3480 3490 3500 3510 3520 

TGCTGATGTC CTTAGAAGAA CTGATCAAAT GCCTGAATAT GAAAGATGAA GAG CTTAAG A AACAAATGCC TATTGGAGGA 
LLMS LEE LIK WLNM KDE ELK KQMP IGO> 

3530 3540 3550 3560 3570 3580 3590 3600 

GATGTTCCAG CCTTACAGCT CCAGTATGAC CATTGTAAGG CCCTGAGACG GGAGTTAAAG GAGAAAGAAT ATTCTGTCCT 
DVP ALQL QYD HC-K ALRR ELK EKE Y S V L> 

3610 3620 3630 3640 3650 3660 3670 3680 

GAATGCTGTC GACCAGGCCC GAGTTTTCTT GGCTGATCAG CCAATTGAGG CCCCTGAAGA GCCAAGAAGA AACCTACAAT 
N A V DQA R V F L ADQ PIE APBE PRR KLO> 

3690 3700 3710 3720 3730 3740 3750 3760 

CAAAAACAGA ATTAACTCCT GAGGAGAGAG CCCAAAAGAT TGCCAAAGCC ATGCGCAAAC AGTCTTCTGA AGTCAAAGAA 
LTP EER AQKI AKA MRK QSS E V K E> 

3770 3780 3790 3800 3810 3820 3830 3840 

AAATGGGAAA GTCTAAATGC TGTAACTAGC AATTGGCAAA AGCAAGTGGA CAAOGCATTG GAGAAACTCA GAGACCTGCA 
SLNA VTS MWQ KQVD K A L EKL R D L Q> 

3850 3860 3870 3880 3890 3900 3910 3920 

GGGAGCTATG GATGACCTGG ACGOTGACAT GAAGGAGGCA GAGTCCGTGC GGAATGGCTG GAAGCCCGTG GGAGACTTAC 
GAM DDL DADM K E A E S V RNGW KPV GDL> 

3930 3940 3950 3960 3970 3980 3990 4000 

TCATTGACTC GCTGCAGGAT CACATTGAAA AAATCATGGC ATTTAGAGAA GAAATTGCAC CAATCAACTT TAAAGTTAAA 
LIDS LQD HIE KIMA PRE E I A P I N F K V K> 

4030 4040 4050 4060 4070 4080 

iCTGTCT CCACTTGACC TGCATCCCTC TCTAAAGATG TCTCGCCAGC TAGATGACCT 



Figure 9 cont. 



T V N DLSS Q h S P L D LHPS L K M SRQ tDDL> 

4090 4100 4110 4120 4130 4140 4150 4160 

TAATATGCGA TGGAAACTTT TACAQGTTTC TGTGGATGAT CGCCTTAAAC AGCTTCAGGA AGCCCACAGA GATTTTGGAC 
NMR WKL LQVS VDD RLK QLQE AHR DPG> 

4170 4180 4190 4200 4210 4220 4230 4240 

CATCCTCTCA GCATTTTCTC TCTACGTCAC TCCAGCTGCC GTGGCAAAGA TCCATTTCAC ATAATAAAGT GCCCTATTAC 
PSSQ HFL STS VQLP KQR SIS H N K V P Y Y> 

4250 4260 4270 4280 4290 4300 4310 4320 

AAACACAGAC CACCTGTTGG GACCATCCTA AAATGACCGA ACTCTTTCAA TCCCTTGCTG ACCTGAATAA 
QTQT TCW DHP KMTE L F Q SLA DLKN> 

4330 4340 4350 4360 4370 4380 4390 4400 

TGTACGTTTT TCTGCCTACC GTACAGCAAT CAAAATCCGA AGACTACAAA AAGCACTATG TTTGGATCTC TTAGAGITGA 
VRF SAY RTAI KIR RLQ K A L C LDL L E Ij> 

4410 4420 4430 4440 4450 4460 4470 4480 

GTACAACAAA TGAAATTTTC AAACAGCACA AGTTGAACCA AAATGACCAG CTCCTCAGTG TTCCAGATCT CATCAACTGT 
S T T N E I F KQH KLNQ N D Q L L S V P D V I N C> 

4490 4500 4510 4520 4530 4540 4550 4560 

CTGACAACAA CTTATGATGG ACTTGAGCAA ATGCATAAGO ACCTGGTCAA CGTTCCACTC TGTCTTGATA TGTGTCTCAA 
LTT TY DG L E Q MHK DLVN VPL CVD M G t, N> 

4570 4580 4590 4600 4610 4620 4630 4640 

TTGGTTGCTC AATGTCTATG ACACGGGTCG AACTGGAAAA ATTAGAGTGC AGAOTCTGAA GATTGGATTA ATGTCTCTCT 
WLL N V Y DTGR TGK 1 R V OSLK I Q L MSL> 

4670 4680 4690 4700 4710 4720 

ACAGAT ATCTCTTTAA GOAAGWGCO GGGCCGACAG AAATGTGTGA CCAGAGGCAG 

SKGL I. E E KYR Y h P K EVA GPT E M C D Q R Q> 

4730 4740 4750 4760 4770 4780 4790 4800 

CTGGGCCTGT TACTTCATCA TGCCATCCAG ATCCCCCGGC AGCTAGGTGA AGTAGCAGCT TTTGGAGGCA GTAATATTCA 
LGL LLHD A I Q IPR QLGE V A A PGG SNIE> 

4810 4820 4830 4840 4850 4860 4870 4880 

GCCTAGTGTT CGCAGCTGCT TCCAACAQAA TAACAATAAA CCAGAAATAA GTGTGAAAGA GTTTATAGAT TGGATGCATT 
PSV RSC FQQ.N NH .K PEI SVKE FID WMH> 

4890 4900 4910 4920 4930 4940 4950 4960 

TGGAACCACA GTCCATGCIT TGGCTCCCAG TTTTACATCG AGTGGCAOCA GCGGAGACTG CAAAACATCA GGCCAAATGC 
DEPQ SMV-W.LP VLHR VAA AET AKHQ AKC> 

4970 4980 4990 5000 5010 5020 5030 5040 

AACATCTGTA AAGAATGTCC AATTGTCGGG TTCAGGTATA GAAGCCTTAA GCATTTTAAC TATGATGTCT GCCAGAGTTG 
NIC KECP IVG FRY RSLK H F H Y D V C Q S C> 

5050 5060 5070 5080 5090 5100 5110 5120 

TTTCTTTTCG GGTCGAACAG CAAAAGGTCA CAAATTACAT TACCCAATGG TGGAATATTG TATACCTACA ACATCTGQGG 
FFS GRT AKGH KLH VPM VEYC IPT TSG> 

5130 5140 5150 5160 5170 5180 5190 5200 

AAGATGTACG AGACTTCACA AAGGTACTTA AGAACAAGTT CAGGTCGAAG AAGTACTTTG CCAAACACCC TCQACTTGGT 
EDVR DFT KVL KNKF RSK KYF AKHP RLG> 

5210 5220 5230 5240 5250 5260 5270 

TACCTGCCTG TCCAGACAGT TCTTGAAGGT GACAACTTAG AGACTCCTAT CACACTCATC AGTATGTGGC 
YLP VQTV LEG D N L. ETPI T L I SMW PEHY> 

5290 5300 5310 5320 5330 5340 5350 5360 

TGACCCCTCA CAATCTCCTC AACTGTTTCA TGATGACACC CATTCAAGAA TAGAACAATA TCCCACACGA CTGGCCCAGA 
DPS QSP QLFH DDT HSR IEQY ATR LAQ> 

5370 5380 5390 5400 5410 5420 5430 5440 

TGGAAAGGAC TAATGGGTCT TTTCTCACTG ATAGCAGCTC CACCACAGGA AGTGTGGAAG ACGAGCACGC CCTCATCCAG 
MERT KQS F L T DSSS T T G S V E DEHA L I Q> 

5450 5460 5470 5480 5490 5500 5510 5520 

CAGTATTGCC AAACACTCGG AGGAGAGTCC CCAGTGAGCC AGCCGCAGAG CCCAGCTCAG ATCCTGAAGT CAGTAGAGAG 
QYC Q T L G GES PVS QPQS PAQ ILK S V E R> 

5530 5540 5550 5560 5570 5580 5590 5600 

GGAAGAACGT GGAGAACTGG AGAGGATCAT TGCTGACCTG GAGGAAGAAC AAAOAAATCT ACAGGTGGAG TATGAGCAGC 
E E R GBL. B R I I A D L EEE QRNL QVE YEQ> 

5610 5620 5630 5640 5650 5660 5670 5680 

TGAAGGACCA GCACCTCCGA AGGGGGCTCC CTGTCGGTTC ACOGCCAGAG TCGATTATAT CTCCCCATCA CACGTCTGAG 
L K D Q HLR R G L PVGS PPE SII SPHH TSE> 

5690 5700 5710 5720 5730 5740 5750 5760 

GATTCAOAAC TTATAGCAGA AGCAAAACTC CTCAGGCAGC ACAAAGGTCG GCTGGAGGCT AGGATGCAGA TTTTAGAAGA 
DSE LIAE A K L L R Q HKGR LEA RMQ I L E D> 

5770 5780 5790 5800 5810 5820 5830 5840 

TCACAATAAA CAGCTGGAGT CTCAGCTCCA CCGCCTCCGA CAGCTGCTGG AGCAGCCTGA ATCTGATTCC CGAATCAATG 
HNK QLE SQLH RLR QLL EQPE SDS R I N> 

5850 5860 5870 5880 5890 5900 5910 5920 

GTGTTTCCCC ATGQGCTTCT CCTCAGCATT CTGCACTGAG CTACTCGCTT GATCCAGATG CCTCCGQCCC ACAGTTCCAC 
OVSP WAS PQH SALS Y S h DPD A S G P Q. F H> 

5930 5940 5950 5960 5970 5980 5990 6000 

CAGGCAGCGG GAGAGGACCT GCTGGCCCCA CCGCACGACA CCAGCACGOA TCTCACGGAG GTCATGGAGC AGATTCACAG 
0. A A GBDL LAP PHD TSTD LTE VME Q. I H S> 

6010 6020 6030 6040 6050 

CACGTTTCCA TCTTGCTGCC CAAATGTTCC CAGCAGGCCA CAGGCAATGT AATCACTAG 
T F P SCC PNVP SRP QAM *> 
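
The deduced translation printed beneath the nucleotide sequence of Figure 9 can be spot-checked computationally. The following is a minimal sketch only, assuming the standard genetic code and assuming that translation begins at the first ATG in the printed sequence; the function name and the constant holding nucleotides 1 to 80 are hypothetical and are not part of the specification.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG to the first stop codon or end of input."""
    seq = "".join(dna.split()).upper()  # drop the spacing used in the figure
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        # Codons containing ambiguity codes or OCR damage are shown as X.
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1 to 80 as printed on the first line of Figure 9.
FIGURE_9_NT_1_TO_80 = (
    "ACTAGTCAAG ATGAGCGGCC TGGCAGCCAC CACGTTTCAT "
    "TGGAAAAAGT GCAGATTGGA TTTGCCAGGG CATGTAGCTC"
)

print(translate_from_first_atg(FIGURE_9_NT_1_TO_80))
```

For these 80 nucleotides the sketch yields MSGLAATTFHWKKCRLDLPGHVA, which agrees with the residues printed under the first line of Figure 9. Later lines of the figure contain character-recognition errors, so a mismatch there would more likely reflect damage to the printed text than an error in the sequence itself.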
Figure 10 